=Paper= {{Paper |id=Vol-2084/shortplus8 |storemode=property |title=Geocoding, Publishing, and Using Historical Places and Old Maps in Linked Data Applications |pdfUrl=https://ceur-ws.org/Vol-2084/shortplus8.pdf |volume=Vol-2084 |authors=Esko Ikkala,Eero Hyvönen,Jouni Tuominen |dblpUrl=https://dblp.org/rec/conf/dhn/IkkalaHT18 }} ==Geocoding, Publishing, and Using Historical Places and Old Maps in Linked Data Applications== https://ceur-ws.org/Vol-2084/shortplus8.pdf
Geocoding, Publishing, and Using Historical Places and
       Old Maps in Linked Data Applications

                    Esko Ikkala¹, Eero Hyvönen¹,², and Jouni Tuominen¹,²
          ¹ Semantic Computing Research Group (SeCo), Aalto University, Finland
     ² HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Finland
              http://seco.cs.aalto.fi/projects/histoplaces/en/
                            firstname.lastname@aalto.fi



           Abstract. This paper presents a Linked Open Data brokering service prototype
           Hipla.fi for using and maintaining historical place gazetteers and maps based on
           distributed SPARQL endpoints. The service introduces several novelties: First,
           the service facilitates collaborative maintenance of geo-ontologies and maps in
           real time as a side effect of annotating contents in legacy cataloging systems. The
           idea is to support a collaborative ecosystem of curators that creates and maintains
           data about historical places and maps in a sustainable way. Second, in order to fos-
           ter understanding of historical places, the places can be provided on both modern
           and historical maps, and with additional contextual Linked Data attached. Third,
           since data about historical places is typically maintained by different authorities
           and in different countries, the service can be used and extended in a federated
           fashion, by including new distributed SPARQL endpoints (or other web services
           with a suitable API) into the system.


Keywords: historical place, old map, linked data, crowdsourcing, geocoding


1      Relating Historical Information to Geographic Locations

Historical documents and content include references to historical places that provide an
essential context for the data. However, historical places cannot necessarily be found on
modern maps and gazetteers, but only on old maps from a matching time period. Deal-
ing with historical geographical places and gazetteers3 [9] adds a temporal dimension
and the notion of change to Geographic Information Systems (GIS). Many, if not most,
historical places, such as Carthago or Czechoslovakia, no longer exist on modern
maps or have at least changed substantially over time.
    Linked Data publishing principles [3] and geospatial place ontologies [1] are be-
coming popular in georeferencing [5], i.e., in relating information to geographic loca-
tions in information sciences. Ontologies define classes and individuals for representing
geographic regions, their properties, and mutual topological and other relationships.
Interoperability of dataset contents in terms of geographical places can be fostered by
sharing place resource URIs in different applications, preferably already when
cataloging and annotating data.
 3
     A gazetteer is a geographical dictionary or directory used in conjunction with a map or an
     atlas.
    To facilitate geographic information retrieval, data analysis, and visualization of
historical data, old placenames on old maps need to be geocoded. This paper presents
a solution to this with a prototype implementation supporting crowdsourced placename
geocoding as Linked Data. A public service⁴ was established and integrated with Map
Warper⁵, an open source map georectifying tool developed at the New York Public
Library. New place instances can be compared with existing ones in the underlying
Linked Data repository (ontology) to foster reuse and to prevent the creation of
multiple instances of the same place. Metadata about the maps is stored in a Linked
Data repository in a similar way to places, which facilitates using maps in applications
via a SPARQL endpoint.
    As a pilot use case, we show how the Hipla.fi data service has been applied in
creating a semantic portal for Second World War Data [8] dealing with places in pre-
war and contemporary Finland.


2      Prototype Implementation: Hipla.fi

In this section we show how the Hipla.fi service is used in practice. Fig. 1 depicts the
user interface, providing the end user with the following functionalities:
    Searching places. For finding, disambiguating, and examining historical places,
there is an autocompletion search input field (a). By using the checkboxes above (b),
the user can select which datasets (e.g., TGN, Suggested New Places) are included in
the search results. The results are grouped by dataset, and they can be examined
as follows:

 1. Hovering the cursor over a search result shows where the place is: the
    corresponding marker bounces on the map.
 2. Clicking a search result label or the corresponding map marker opens the info win-
    dow of the place, showing its context (c).
 3. Clicking the menu button on a result row (a) shows the place data in a Linked Data
    browser for investigating the data in detail.

    Multiple dataset browsing. If the user does not know the name of the place but
has some idea where it is located, she can pan and zoom the map view to
the area. After this, it is possible to use the “View all places on current map view” button
next to (b) on the left. This way, places from the different datasets connected to Hipla.fi are
rendered on the map, and the user can check whether the place already exists in some of the
datasets. Places are color-coded by dataset, which makes it
possible to compare places in different gazetteers.
 4
     http://hipla.fi
 5
     https://github.com/timwaters/mapwarper

                                      Fig. 1. Hipla.fi user interface.

    View on historical maps. The “Maps” tab (b) provides a list of old maps that
intersect the current map view. The map images are fetched from Hipla.fi’s Map Warper
georectifying service⁶ and their metadata is queried with SPARQL from the map RDF
graph of the Hipla.fi service. Each map has a checkbox for rendering the map on the
main map view, a thumbnail image, information about map series, scale and type, and a
link to view the map in Map Warper. All map series are visible by default, but with the
map series button it is possible to filter maps series-wise. Once one or more historical
maps have been selected with the checkboxes for viewing, the opacity of the historical
maps can be adjusted with the slider that is located in the top right corner of the map.
If the user pans or zooms the main map view, clicking the “Refresh map list” button
updates the map list.
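
The selection of maps that intersect the current map view can be illustrated with a bounding-box overlap test. The following is a minimal sketch under stated assumptions, not the service's actual implementation: the `BBox` type, the function names, and the sample extents are hypothetical, and Hipla.fi itself obtains map metadata with SPARQL rather than by filtering in Python. Boxes crossing the antimeridian are not handled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in WGS84 degrees (hypothetical helper type)."""
    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if the two boxes overlap or touch."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


def maps_in_view(view: BBox, maps: Dict[str, BBox]) -> List[str]:
    """Return the sorted names of maps whose extent intersects the view."""
    return sorted(name for name, extent in maps.items()
                  if intersects(view, extent))
```

Calling `maps_in_view` again after the user pans or zooms corresponds to the effect of the “Refresh map list” button described above.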
     View contextual data. When the user selects a place, the resource can be browsed
using the Linked Data browser SAHA⁷ to see its detailed structure. Furthermore,
contextual data (c) connecting the place to other relevant data sources is provided in an
infobox.
     Suggesting new placenames. If the place at hand does not exist in any of the datasets
connected to Hipla.fi, the user can submit a place suggestion by clicking the “Add a
new place” button and filling in the place details form. Coordinates for the new place
suggestion can be selected from the Google map view, and it is possible to use historical
map sheets for setting the coordinates. Finally, the user must select the target dataset
for the place suggestion. After the “Save changes” button is clicked, the new place
suggestion is available to all users of the service. This mechanism prevents the
creation of duplicate place suggestion entries.
     New datasets can be added to the Hipla.fi service by providing their configuration
to the system. The needed information includes 1) the SPARQL endpoint URL, 2) a
SPARQL query for the autocompletion search, and 3) an HTML template for rendering
a SPARQL result in the autocompleted result list. In addition, another SPARQL query
and an HTML template can be supplied for providing contextual data for the user when
a place is selected.
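
The three configuration items can be pictured as a small record per dataset. The sketch below is illustrative only and does not reproduce the configuration format of Hipla.fi: the class and field names, the `$term` placeholder convention, the example endpoint URL, and the SKOS-based query are all assumptions made for this example.

```python
import html
from dataclasses import dataclass
from string import Template
from typing import Dict, List


@dataclass(frozen=True)
class DatasetConfig:
    """Configuration of one gazetteer (field names are hypothetical)."""
    endpoint_url: str       # 1) SPARQL endpoint URL
    search_query: Template  # 2) autocompletion query with a $term placeholder
    result_html: Template   # 3) HTML template for one result row


# Hypothetical autocompletion query: prefix match on preferred labels.
SEARCH_QUERY = Template("""
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?place ?label WHERE {
  ?place skos:prefLabel ?label .
  FILTER(STRSTARTS(LCASE(STR(?label)), LCASE("$term")))
} LIMIT 20
""")


def build_query(config: DatasetConfig, term: str) -> str:
    """Insert the search term into the query, escaping backslashes and quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return config.search_query.substitute(term=escaped)


def render_results(config: DatasetConfig,
                   rows: List[Dict[str, str]]) -> List[str]:
    """Render each result row (variable name -> value) with the HTML template."""
    return [
        config.result_html.substitute(
            {key: html.escape(value) for key, value in row.items()})
        for row in rows
    ]
```

A second query and template pair for contextual data could be added to the same record in the same way.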
 6
     http://mapwarper.onki.fi
 7
     http://seco.cs.aalto.fi/services/saha/
    The system was implemented using the Linked Data Finland platform⁸ [7], based
on Fuseki⁹ with a Varnish Cache¹⁰ front end for serving the Linked Data. The end-user
interface of Hipla.fi is a lightweight HTML5 single-page map application, which
provides access to multiple data sources with SPARQL queries and autocomplete search
functionality using typeahead.js¹¹. An embedded Google Maps view is used to visualize
historical places.


3    Application Case: An Ontology of World War II Places

This section presents an application of the Hipla.fi prototype in the WarSampo Portal12 ,
a system for publishing collections of heterogeneous, distributed data about the Second
World War on the Semantic Web. The WarSampo Portal allows both historians and
laymen to study war history and destinies of their family members in war from different
interlinked perspectives.
    The war zone between Finland and the Soviet Union during WW2 was
annexed to the Soviet Union after the war, and modern maps have only Soviet or Russian
names, making it impossible to use modern gazetteers to describe primary source data
of the war, such as photographs, articles, war diaries, etc., in which original Finnish
placenames are used. To provide the missing target ontology for named entity linking
of WW2 related materials, a historical geo-ontology of placenames and maps covering
the war years 1939–1945 was created.
    The ontology was built by combining and populating the Hipla.fi service with six
data sources: 1) National Archives of Finland’s map application data of 612 wartime
municipalities, 2) the Finnish Spatio-Temporal Ontology describing the regions of the
Finnish municipalities at different times¹³, 3) a dataset of geocoded Karelian map
names (34,000 map names with coordinates and place types), 4) the current Finnish
Geographic Names Registry (800,000 places), 5) Historical Senate atlas (ca. 1900), and
6) Karelian maps (1928–1951).
    Named entity linking of placenames was used to automatically link [4] 160 000
photo captions, over 1000 principal event descriptions, 95 000 death records, 4500 war
prisoner records, and 3400 magazine articles to geographic locations. The resulting
data is available as 5-star Linked Open Data at the Linked Data Finland service14 , with
content negotiation, a SPARQL endpoint, and additional services for reusing the data.
    Using the automatically generated links it was possible to build the WarSampo
Places Perspective15 for viewing WarSampo contents on both modern and historical
maps. The Places Perspective was implemented by re-using Hipla.fi user interface com-
ponents.
 8
   http://ldf.fi
 9
   http://jena.apache.org/documentation/serving_data/
10
   https://www.varnish-cache.org
11
   http://twitter.github.io/typeahead.js/
12
   https://www.sotasampo.fi/en
13
   http://seco.cs.aalto.fi/ontologies/sapo/
14
   http://www.ldf.fi/dataset/warsa
15
   https://www.sotasampo.fi/en/places/
4    Related Work and Discussion


This paper presented Hipla.fi, a service for brokering historical places from distributed
Linked Data gazetteers on historical and contemporary maps. There are several gazetteers
of historical places on the web, such as The Historical Gazetteer of England’s Place-
names16 , Gazetteer for Scotland17 , the Danish service DigDag18 for finding historical
administrative areas with polygons on maps, the Dutch services Gemeentegeschiede-
nis.nl19 and Histopo.nl20 , and the Alexandria Digital Library Gazetteer [6].
    Thesauri of historical places, published as Linked Data, include the Getty TGN
of some 1.5 million records and Pleiades21 [2] for ancient places. Pelagios projects22
develop APIs and GUIs for multiple historical gazetteers, such as Pleiades. DBpedia23
contains masses of Linked Data of historical and contemporary places while GeoNames
focuses on modern places. VIAF24 brokers mutually aligned authority files, including
historical placenames, from various national libraries around the world in Linked Data
form, and from some additional open data sources, such as DBpedia and Wikidata.
    The big challenge when working with placenames is that they are highly ambiguous
(polysemy). There can be dozens or even hundreds of places around Finland with the
same name, which presents a serious challenge for, e.g., automatic linking of events to
places based on the description texts of events. Utilizing place type information is one
partial solution to this problem. For example, when linking the placename references in
WarSampo datasets to resources in the place ontology, the following order of priority
was used: 1) municipality, 2) town, 3) village, 4) body of water. House names were
the most ambiguous, and they were not used in automatic linking. They would, however, be
useful if the linking were done manually.
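
The priority order above can be expressed as a simple ranking over candidate places. This is a minimal sketch assuming each candidate is a dictionary with `uri` and `type` keys; the function name, the data layout, and the example URIs are hypothetical and do not describe the WarSampo linking pipeline itself [4].

```python
from typing import Dict, List, Optional

# Priority order stated in the text; a lower index wins.
TYPE_PRIORITY = ["municipality", "town", "village", "body of water"]


def pick_candidate(
        candidates: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the candidate whose place type ranks highest.

    Candidates with a type outside TYPE_PRIORITY (for example houses)
    are ignored; None is returned if no candidate qualifies.
    """
    ranked = [c for c in candidates if c["type"] in TYPE_PRIORITY]
    if not ranked:
        return None
    return min(ranked, key=lambda c: TYPE_PRIORITY.index(c["type"]))
```

Ties between candidates of the same type are resolved here by input order, which is a simplification; such cases are the ones most likely to need manual work.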
    Another major difficulty has been that different geographic data sources, such as
maps used as the basis for geocoding, overlap, producing multiple instances
of the same places. A partial solution to this issue was to remove duplicate placenames in
advance when two places shared a name, were close to each other, and had the same
place type. However, there remain cases where it is not possible to differentiate between
multiple placenames without manual work.
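
The duplicate removal rule (same name, same place type, close to each other) can be sketched as follows. The paper does not state a distance threshold, so the 1 km default is an assumption, as are the `Place` type, the function names, and the case-insensitive name comparison. Distances use the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math
from typing import List, NamedTuple


class Place(NamedTuple):
    """Minimal place record (hypothetical structure for this sketch)."""
    name: str
    place_type: str
    lat: float
    lon: float


def haversine_km(a: Place, b: Place) -> float:
    """Great-circle distance in kilometres between two places."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def remove_duplicates(places: List[Place], max_km: float = 1.0) -> List[Place]:
    """Keep the first of any places sharing name and type within max_km."""
    kept: List[Place] = []
    for p in places:
        is_duplicate = any(
            p.name.lower() == k.name.lower()
            and p.place_type == k.place_type
            and haversine_km(p, k) <= max_km
            for k in kept
        )
        if not is_duplicate:
            kept.append(p)
    return kept
```

Places that share a name but differ in type or lie farther apart are kept, which mirrors the remaining cases that cannot be resolved without manual work.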
    These challenges indicate that it is important to support both manual and automatic
geocoding. The Hipla.fi service combines different geographic data sources into a uni-
fied view, which enables efficient search and comparison of possibly overlapping data
sources.

16
   http://www.placenames.org.uk
17
   http://www.scottish-places.info
18
   http://www.digdag.dk
19
   http://www.gemeentegeschiedenis.nl
20
   http://histopo.nl
21
   http://pleiades.stoa.org
22
   http://commons.pelagios.org
23
   http://www.dbpedia.org
24
   http://viaf.org
Acknowledgements. Hanna Hyvönen rectified Hipla.fi maps and Eetu Mäkelä
contributed to creating gazetteers. Our work was supported by the Finnish Cultural
Foundation and the Wikidata Project of Wikimedia Finland.


References
1. Ashish, N., Sheth, A. (eds.): Geospatial Semantics and Semantic Web: Foundations, Algo-
   rithms, and Applications. Springer–Verlag (2011)
2. Elliott, T., Gillies, S.: Digital geography and classics. Digital Humanities Quarterly 3(1)
   (2009)
3. Heath, T., Bizer, C.: Linked Data: Evolving the Web into a Global Data Space (1st edition).
   Synthesis Lectures on the Semantic Web: Theory and Technology, Morgan & Claypool (2011)
4. Heino, E., Tamper, M., Mäkelä, E., Leskinen, P., Ikkala, E., Tuominen, J., Koho, M., Hyvönen,
   E.: Named entity linking in a complex domain: Case second world war history. In:
   Proceedings, Language, Technology and Knowledge (LDK 2017). pp. 120–133. Springer–Verlag
   (2017)
5. Hill, L.: Georeferencing: The geographic associations of information. MIT Press (2009)
6. Hill, L., Frew, J., Zheng, Q.: Geographic names: The implementation of a gazetteer in a geo-
   referenced digital library. D-Lib 5(1) (1999)
7. Hyvönen, E., Tuominen, J., Alonen, M., Mäkelä, E.: Linked Data Finland: A 7-star model
   and platform for publishing and re-using linked datasets. In: The Semantic Web: ESWC 2014
   Satellite Events, Revised Selected Papers. pp. 226–230. Springer–Verlag (2014)
8. Hyvönen, E., Heino, E., Leskinen, P., Ikkala, E., Koho, M., Tamper, M., Tuominen, J., Mäkelä,
   E.: WarSampo data service and semantic portal for publishing linked open data about the
   second world war history. In: Proc. of ESWC 2016. Springer–Verlag (2016)
9. Southall, H., Mostern, R., Berman, M.L.: On historical gazetteers. International Journal of
   Humanities and Arts Computing 5(2), 127–145 (2011)