<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assessing geographic data usability in analytical contexts by using sensitivity analyses of geospatial processes.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Robin Frew</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p> Do different methods of representing real-world features have an effect on the findings from GIS analyses?  To what extent does choice of data sources affect network analysis?  In considering network accessibility, are results affected by the representation of supply and demand considerations? 1 Copyright © by the paper's authors. Copying permitted for private and academic purposes. In: A. Comber, B. Bucher, S. Ivanovic (eds.): Proceedings of the 3rd AGILE PhD School, Champs sur Marne, France, 15-17-September-2015, published at http://ceur-ws.org</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The number and variety of sources of spatial data continue to expand, as do the
debates regarding the quality and usability of such data, particularly those which
are Free and Open Source (FOS) or free-to-use. The highest quality data is often
expensive to obtain, and the option of cost-free data sets is tempting for many
users.</p>
      <p>
        Although there is a large body of data quality research, and a great number of
studies have investigated the usability of devices and interfaces, little attention
has been paid to the usability of data, and even less to the usability of
geographical data in typical GIS research situations. There has been some
research into the use of volunteered geographic information (VGI) in the field of
data quality theory and assessment
        <xref ref-type="bibr" rid="ref1 ref7">(see for example Haklay, 2010; Zielstra and
Zipf, 2010)</xref>
        , but relatively few studies have incorporated sensitivity analysis
involving the application of different sources of spatial data to a range of GIS tasks.
        <xref ref-type="bibr" rid="ref3">Jones’s (2010</xref>
        ) study into the use of open data in presenting and visualising public
health information is one notable exception; another is
        <xref ref-type="bibr" rid="ref2">Higgs et al.'s (2012</xref>
        ) examination of the impact of alternative approaches to measuring
accessibility to green space.
      </p>
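      <p>This study set out to address cross-cutting themes that are topical in GIS and
geographical analysis given trends towards the use of open source data, namely:
whether different methods of representing real-world features affect the findings
from GIS analyses; the extent to which the choice of data sources affects network
analysis; and, in considering network accessibility, whether results are affected by
the representation of supply and demand. There is little evidence to date with which
to quantify the effects of these issues on final results. This research is intended
as a step towards redressing gaps in the knowledge, understanding and perception of
such data.</p>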
      <p>This study argues that even the best quality data may not be appropriate in certain
contexts. To highlight the types of scenario in which this may be the case,
several commercial and free-to-use data sources were used in sensitivity analyses
of well-established GIS network analysis tasks. The aim was to assess whether
findings vary according to which of the alternative data sets is used to represent
the same features within such models.</p>
      <p>The research took the form of several case studies built around a common theme:
accessibility. Some of the studies assessed accessibility to features that
have been the subject of much research in the past (such as GP surgeries), while
others looked at less commonly assessed features (such as primary schools,
secondary schools and sports facilities). All were linked by an interest in various
health and fitness initiatives and investigations that have taken place in South
Wales (UK), such as those looking at active travel to schools, equitable access to
health care and reasonable geographical access to sport and leisure facilities.
The part of the study relating to accessibility to primary schools is used here as an
example.</p>
      <p>The supply features (primary schools) were represented in four different ways by
the two datasets examined: a Point of Interest<sup>1</sup> point (nominally the centroid of the
main school building); the pedestrian access points of each school; the geometric
centroid of the entire school site (including play areas, sports fields and car parks);
and the site perimeter. The Ordnance Survey Sites dataset<sup>2</sup>, by providing the
footprint of each school as well as the access points, offered more detail and precision
for measurements of access, raising the further question of whether any
increase in precision automatically results in more accurate results.
The places of origin for journeys to the schools were kept constant: the
population-weighted centroids of UK census Output Areas (the smallest unit of published
UK census data).</p>
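      <p>The four representations can give quite different distances from the same origin. The minimal Python sketch below illustrates this for a single hypothetical rectangular school site. The coordinates, the function names and the use of straight-line rather than network distance are all assumptions made for illustration; none of them is taken from the study's data or workflow.</p>
      <preformat><![CDATA[
```python
from math import hypot

Point = tuple[float, float]


def polygon_centroid(ring: list[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest straight-line distance from p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length2 = dx * dx + dy * dy
    if length2 == 0.0:  # degenerate segment: a and b coincide
        return hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the line through a and b, clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length2
    t = max(0.0, min(1.0, t))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def distance_to_perimeter(p: Point, ring: list[Point]) -> float:
    """Shortest distance from p to any edge of the polygon ring."""
    edges = zip(ring, ring[1:] + ring[:1])
    return min(distance_to_segment(p, a, b) for a, b in edges)


def euclid(p: Point, q: Point) -> float:
    return hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical 100 m x 60 m school site (all coordinates are invented).
site = [(0.0, 0.0), (100.0, 0.0), (100.0, 60.0), (0.0, 60.0)]
poi_point = (20.0, 40.0)                      # stands in for the building centroid
access_points = [(0.0, 10.0), (100.0, 50.0)]  # stand in for pedestrian gates
origin = (-30.0, 30.0)                        # stands in for a population centroid

distances = {
    "poi_point": euclid(origin, poi_point),
    "nearest_access_point": min(euclid(origin, a) for a in access_points),
    "site_centroid": euclid(origin, polygon_centroid(site)),
    "site_perimeter": distance_to_perimeter(origin, site),
}

for name, d in distances.items():
    print(f"{name}: {d:.1f} m")
```
]]></preformat>
      <p>Even in this toy case the perimeter gives the shortest distance and the site centroid the longest, which is the kind of variation between representations that the sensitivity analysis is designed to expose.</p>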
      <p>Distances from each population centroid were measured to the nearest school,
looking at each representation in turn, using the various network datasets. The
network datasets included commercial data (Ordnance Survey ITN and ITN with
Urban Paths<sup>3</sup>), free-to-use data (Ordnance Survey OpenRoads<sup>4</sup>) and FOS data
(OpenStreetMap<sup>5</sup>).</p>
      <p><sup>1</sup> https://www.ordnancesurvey.co.uk/business-and-government/products/points-of-interest.html
<sup>2</sup> https://www.ordnancesurvey.co.uk/business-and-government/products/topography-layer.html
<sup>3</sup> https://www.ordnancesurvey.co.uk/business-and-government/products/itn-layer.html</p>
      <p>
        Sensitivity analysis was conducted by repeating the
distance calculations so that every combination of network (plus Euclidean
measurement) was used with every feature representation. The process was then
repeated in its entirety using a Two-Step Floating Catchment Area (2SFCA)
measurement. As described by
        <xref ref-type="bibr" rid="ref5">Luo and Wang (2003)</xref>
        , 2SFCA incorporates levels
of supply and demand in two steps. First, a provider-to-population ratio is
calculated for each supply centre from the demand within a defined threshold
distance. Second, all supply centres within the same threshold distance of each
demand centre are identified and their ratios are summed for that population.
Supply was represented by the student capacity of each
school (the school 'roll', from figures published by the local authority). Demand
was represented by the number of primary school-age children in each census area
(as extracted from published 2011 census data).
      </p>
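      <p>The two steps of 2SFCA described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name, the dictionary-based inputs, the 1000 m threshold and all capacities, populations and distances are invented, and the ratio is computed as supply divided by demand, following the convention in Luo and Wang (2003).</p>
      <preformat><![CDATA[
```python
def two_step_fca(
    supply: dict[str, float],
    demand: dict[str, float],
    distance: dict[tuple[str, str], float],
    threshold: float,
) -> dict[str, float]:
    """Return a 2SFCA accessibility score for each demand centre.

    distance[(i, j)] is the distance from demand centre i to supply centre j.
    """
    # Step 1: for each supply centre, divide its capacity by the total
    # demand that lies within the threshold distance of it.
    ratio: dict[str, float] = {}
    for j, capacity in supply.items():
        population = sum(
            demand[i] for i in demand if distance[i, j] <= threshold
        )
        ratio[j] = capacity / population if population > 0 else 0.0

    # Step 2: for each demand centre, sum the ratios of every supply
    # centre that lies within the same threshold distance.
    return {
        i: sum(ratio[j] for j in supply if distance[i, j] <= threshold)
        for i in demand
    }


# Invented example: two schools (capacity in places) and three census
# areas (number of primary school-age children). Distances are in metres.
schools = {"S1": 200.0, "S2": 100.0}
children = {"A": 50.0, "B": 150.0, "C": 100.0}
dist = {
    ("A", "S1"): 400.0, ("A", "S2"): 1500.0,
    ("B", "S1"): 900.0, ("B", "S2"): 700.0,
    ("C", "S1"): 2000.0, ("C", "S2"): 300.0,
}

access = two_step_fca(schools, children, dist, threshold=1000.0)
for area, score in access.items():
    print(f"{area}: {score:.2f} places per child")
```
]]></preformat>
      <p>In a sensitivity analysis of the kind described, only the distance table would change between runs (one table per combination of network and feature representation), while the supply and demand figures stay fixed.</p>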
      <p>The large number of results generated was cross-compared. The comparison
revealed that, for primary schools, the vast majority of results (over 80% of all
comparisons) differed statistically significantly from one another at the p &lt; .001
level, for both distance and 2SFCA measures. This indicated that the different
datasets used were not interchangeable and therefore not equally usable in this
type of study.</p>
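      <p>To give a sense of how quickly the number of comparisons grows, the Python sketch below enumerates every combination of the five distance bases (four networks plus Euclidean) and the four school representations named in the text, then counts the possible pairwise comparisons. It is an illustration under stated assumptions: the paper does not say which pairs were actually compared or which statistical test was applied, so the sketch only counts candidate pairs and the label strings are shorthand.</p>
      <preformat><![CDATA[
```python
from collections import Counter
from itertools import combinations, product

# Distance bases and school representations as named in the text.
networks = [
    "Euclidean",
    "OS ITN",
    "OS ITN with Urban Paths",
    "OS OpenRoads",
    "OpenStreetMap",
]
representations = [
    "Point of Interest point",
    "access points",
    "site centroid",
    "site perimeter",
]

# One scenario per (network, representation) combination.
scenarios = list(product(networks, representations))

# Every unordered pair of distinct scenarios is a candidate comparison.
pairs = list(combinations(scenarios, 2))


def differing_factor(a: tuple[str, str], b: tuple[str, str]) -> str:
    """Say whether two scenarios differ by network, representation or both."""
    if a[1] == b[1]:
        return "network only"
    if a[0] == b[0]:
        return "representation only"
    return "both"


counts = Counter(differing_factor(a, b) for a, b in pairs)

print(len(scenarios), "scenarios,", len(pairs), "candidate pairwise comparisons")
for factor, n in counts.items():
    print(f"{factor}: {n}")
```
]]></preformat>
      <p>Grouping the pairs by the factor that differs is one way to separate the effect of the network dataset from the effect of the feature representation when the results are interpreted.</p>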
      <p>At this early stage of analysis, initial indications were that differences between the
network datasets had the greater effect on results. Differences due to the method of
demand- or supply-side feature representation were less important.</p>
      <p>Initial findings suggest that more attention needs to be given to the nature of the data
sets used to represent such features in GIS-based analytical tasks. The exact
context in which such data sets are applied may determine how usable different
sources of data are in relation to common GIS spatial analytical tasks. A useful
addition to GIS-based analysis going forward could be the derivation of a
typology of circumstances in which adopting alternative sources of open data is more
appropriate.</p>
      <p>Acknowledgments</p>
      <p>This research was funded by Ordnance Survey (OS), but any interpretations of
findings are those of the student and do not necessarily reflect the opinions of OS.</p>
      <p><sup>4</sup> https://www.ordnancesurvey.co.uk/business-and-government/products/os-openroads.html
<sup>5</sup> http://www.openstreetmap.org/</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Haklay</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          )
          <article-title>'How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets'</article-title>
          ,
          <source>Environment and Planning B: Planning and Design</source>
          ,
          <volume>37</volume>
          , pp.
          <fpage>682</fpage>
          -
          <lpage>703</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Higgs</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fry</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Langford</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2012</year>
          )
          <article-title>'Investigating the implications of using alternative GIS-based techniques to measure accessibility to green space'</article-title>
          ,
          <source>Environment and Planning B: Planning and Design</source>
          ,
          <volume>39</volume>
          , pp.
          <fpage>326</fpage>
          -
          <lpage>343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2010</year>
          )
          <article-title>'Open geographical data, visualisation and dissemination in public health information'</article-title>
          ,
          <source>AGI Geocommunity 2010</source>
          [Online]. Available at: http://www.agi.org.uk/storage/geocommunity/presentations/SamuelJones.pdf
          (Accessed: 5 February 2014).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2003</year>
          )
          <article-title>'Measures of spatial accessibility to health care in a GIS environment: synthesis and a case study in the Chicago region'</article-title>
          ,
          <source>Environment and Planning B: Planning and Design</source>
          ,
          <volume>30</volume>
          , pp.
          <fpage>865</fpage>
          -
          <lpage>884</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Zielstra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zipf</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2010</year>
          )
          <article-title>'A comparative study of proprietary geodata and volunteered geographic information for Germany'</article-title>
          ,
          <source>13th AGILE International Conference on Geographic Information Science</source>
          , Guimarães, Portugal. [Online]. Available at: http://agile2010.dsi.uminho.pt/pen/shortpapers_pdf/142_doc.pdf
          (Accessed: 18 April 2013).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>