Reassembling the Lives of Finnish Prisoners of the Second World War on the Semantic Web

Mikko Koho¹, Esko Ikkala¹, Eero Hyvönen¹,²
Semantic Computing Research Group (SeCo)
¹ Aalto University, Finland
² HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Finland
http://seco.cs.aalto.fi/, firstname.lastname@aalto.fi

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

This paper presents the first results of a new, ninth application perspective for the semantic portal WarSampo – Finnish WW2 on the Semantic Web, based on a database of ca. 4 450 Finnish prisoners of war in the Soviet Union. Our key idea is to reassemble the life of each prisoner of war by using Linked Data, based on information about the person in different data sources. Using the enriched aggregated data, a biographical global “home page” can be created for each prisoner of war that is more complete than the information in any individual data source. The application perspective is targeted at researchers of military history, who can study and analyze the data in order to form new research questions or hypotheses, as well as at the public at large looking for information, e.g., about relatives who were captured as prisoners of war. The faceted search of the application perspective makes prosopographical research on subgroups of prisoners possible.

1 Introduction

Representing biographical texts as Linked Data leads to a paradigm change in publishing biographical collections (Hyvönen et al., 2019): the lives can then not only be read as texts by humans but also be processed and analyzed by computational means (Fokkens et al., 2017; Warren et al., 2016), opening new possibilities in Digital Humanities (Gardiner and Musto, 2015) research for biography and prosopography (Verboven et al., 2007) as well as for data reuse in applications. The same idea of Linked Data can also be applied when biographical data is available in semi-structured or structured form from different data sources: the data about a person can be aggregated, harmonized, and reassembled into a global knowledge graph that gives a more complete picture of the biographee than any individual source alone. Based on the knowledge graph, a biography of the biographee can be generated, or alternatively a semi-structured “home page” presenting her/his life. The latter approach was introduced in the semantic portal WarSampo – Finnish WW2 on the Semantic Web [1] (Hyvönen et al., 2016), a web service in use in Finland that had 230 000 users in 2018, typically looking for information about their relatives killed in action during the Second World War (WW2).

This paper presents a new, ninth application perspective, Prisoners of War, to be included in WarSampo. This perspective was created for studying individual people documented in a new prisoners of war (POW) database, as well as groups of them for prosopographical analysis. The new data was aligned with and integrated into the WarSampo person data, which is mostly based on the Finnish WW2 casualties of war [2] database of the National Archives of Finland. The new application perspective enables not only the study of individuals but also prosopographical studies of the prisoners, using either the whole dataset or subsets of it based on user interest and selections in a faceted search (Tunkelang, 2009) view.

The new prisoners of war dataset was originally published as a book (Alava et al., 2003). For integrating and publishing the data as a part of WarSampo, it has been further extended, cleaned, and validated by domain experts using, e.g., information from many war-time archives in Finland and Russia. This paper builds on previous work on WarSampo, which has discussed the Linked Data publication and data model (Koho et al., 2018a), and the data integration challenges (Koho et al., 2018b). Reconstructing the biographies of the casualties of war in WarSampo has been previously presented in (Koho et al., 2017). In contrast to the casualties of war dataset, the POW register can have multiple values for a single property, and contains sources of information for individual data values, creating a need for handling conflicting information about a person.

In the following, the underlying data model and data production process are first explained. After this, the main functionalities of the application from an end user perspective are explained, as well as the technical implementation. In conclusion, the contributions of the work are summarized and contrasted with related work.

[1] This semantic portal was released in 2015 and is in use at https://sotasampo.fi/en/. More information about the project is available at the project home page https://seco.cs.aalto.fi/projects/sotasampo/en/.
[2] http://kronos.narc.fi/menehtyneet/

2 Data Model and Data

The prisoners dataset consists mainly of a register of the Finnish prisoners of war in WW2, containing a spreadsheet of about 4 450 soldiers, auxiliary forces, and civilians captured by the army of the Soviet Union. Additional spreadsheets contain information about POW camps and hospitals, as well as the primary data sources. The data also includes separate documents about the prisoners of war that provide additional information, such as video interviews, images and archived documents.

The original information sources are mostly various registers in Finnish and Russian archives (Alava et al., 2003). Information in different sources can be contradictory, hence it is important to preserve the data source for each individual piece of information. A cell formatting was agreed upon that allows multiple values with source information already in the original spreadsheet that the domain experts worked on. The data formatting evolved as a collaboration between the domain experts maintaining the original dataset and the WarSampo team of Linked Data experts. Other agreements on the spreadsheet structure were also needed: 1) separation and cleaning of values that will be linked to the WarSampo domain ontologies, 2) local identifiers for entities that are used in multiple spreadsheets, and 3) how to express partially or completely missing information.

The WarSampo infrastructure, data service, and semantic portal were chosen as the primary data publication platform by the stakeholders, which include the National Archives of Finland and the Association for Cherishing the Memory of the Dead of the War.

2.1 Prisoners of War as Linked Data

The WarSampo Linked Open Data infrastructure is built to support integrating new datasets into WarSampo, by extending both the data model and the data content. The data is published openly online for everyone to use. The WarSampo web portal then provides different perspectives to the interlinked datasets, as customized web applications. New perspectives can be added to provide views to new datasets, or to show new features of the existing data.

In Linked Data (Heath and Bizer, 2011), information is presented as RDF graphs and all resources in the data have unique identifiers. This enables identifying and sharing common resources, e.g. people, places, and military ranks, between the datasets, thus creating an interlinked knowledge graph.

A simple primary data model is used for the prisoner records, in which one prisoner record corresponds to one row in the source spreadsheet, with each column mapped to a distinct property. All of the personal information about each captured individual is thus contained in the prisoner record, resembling the data model of the WarSampo death records (Koho et al., 2017). The properties and classes of prisoner records and death records have been harmonized using the dumb-down principle of Dublin Core [3], i.e., by using shared super-properties and super-classes where applicable. By mapping columns directly to properties, the data can be shown to the end user in an intuitive way, resembling the original spreadsheet.

WarSampo uses the CIDOC Conceptual Reference Model (CRM) [4] as the harmonizing data model. Prisoner records are modeled as instances of the CRM document class E31 Document. In addition, this data is then used to create CIDOC CRM descriptions of the actual people and events, when appropriate. WarSampo person instances (Leskinen et al., 2017) in the actor ontology are enriched using the prisoner records. New person instances are created for people that do not already exist in the ontology, which is the case for most of the war prisoners. The prisoner records then document the person instance through the CRM property P70 documents. The full WarSampo data model is published on GitHub [5].

2.2 Data Conversion

Our previous work has shown that data transformations need to be repeatable, automated processes (Koho et al., 2018b) in a dynamic infrastructure where there is frequently a need to adapt to changes. An automatic data processing pipeline [6] was developed to integrate the POW data into the WarSampo linked data infrastructure. The pipeline handles data transformation, validation, linking, and harmonization.

The pipeline transforms the spreadsheets into RDF, mapping the spreadsheet columns to RDF properties, with possibly multiple values per property, and containing annotations for primary information sources. Automatic probabilistic entity linking processes then link the records to the WarSampo domain ontologies of military ranks, units, occupations, people, and places. Original literal values are also retained as separate properties.

The original POW register is maintained in spreadsheet format, which can be easily integrated into WarSampo with our automated transformation process when the spreadsheet is updated, provided that the structure stays the same. Likewise, if the linked domain ontologies are updated, the whole integration process can be redone to account for the changes in the probabilistic entity linking.

The cell formatting is validated during the data transformation process. Other simple data validation rules are also applied to find anomalies during data conversions. The validation reports help the domain experts to improve the quality of the source data.

Some parts of the data had to be left out of the online data publication due to privacy issues. This is done automatically based on the date when a person has died. If there is no information about an individual’s date of death, it is assumed that they may still be alive, and their personal information, including given names, is removed, effectively pseudonymizing them. For prisoners who are known to have died less than 50 years ago, health related information is removed, based on the columns of the original spreadsheet that might contain health related information.

2.3 Interlinking within WarSampo

Matching the people in the prisoner records to the ca. 100 000 people already existing in the WarSampo actor ontology is one of the most challenging aspects of the data transformation pipeline. The data model and contents are different, and many pieces of personal information can be missing on both sides. In the first results of the person linking, we were able to link 1431 prisoner records to existing WarSampo person instances, corresponding to 32% of all prisoner records (Koho et al., 2018a). The person linking uses probabilistic record linkage (Gu et al., 2003; Gregg and Eder, 2019) (a.k.a. deduplication) with a machine learning approach, in which each POW’s information is compared with the information in the WarSampo person instances to find matches that have high enough similarity. Initially the record linkage value comparisons were weighted based on domain knowledge, and the weights were then iterated for better accuracy. Finally, a manually curated list of matches was taken to serve as training data for the machine learning approach. The machine learning approach can adapt to data changes on both sides in the record linkage, without having to manually inspect the linking results and adjust the weights.

New person instances are created from the unlinked prisoner records and added into the actor ontology. With the probabilistic record linkage, it is possible that a record is not mapped simply because there is not enough information about either the POW record or the person instance to create a mapping between them. Modifying the information in either the POW data or in the actor ontology means that the whole record linkage process should be redone.

Other information is also linked to WarSampo domain ontologies. Of military ranks, 99% were linked to the WarSampo military ranks domain ontology. Of military units, 91% were linked to pre-existing military units in the actor ontology.

Domain ontologies differ from each other by nature. For example, covering and disambiguating all military ranks is clearly a simpler task than performing the same task with all wartime places. In general, it is not realistic to assume that the domain ontologies completely cover their domain.

War-time municipalities are still to be linked to the WarSampo domain ontologies. More accurate place information could also be linked, but due to the ambiguous nature of the names, this would lead to a high level of error, based on initial experiments.

The created Linked Data stores source information when present in the original data. There are many ways of presenting this kind of provenance information in RDF (Hartig, 2009; Zhao et al., 2010). The approach used with the prisoners of war dataset is storing source information using RDF reification with the DCMI Metadata Terms [7] property source.

2.4 Biographical Data

Each person’s basic personal information in the dataset contains columns like first and last names; dates of birth, return from captivity, and death; municipalities of birth, domicile, and death; and occupation, marital status, and number of children. These enable building some understanding about the life of the person before the war and, in the case of survivors, also after the war.

Structured information is also gathered about the events of going missing and being captured, like the place and time. Biographically interesting information is also given as prose about being captured, the cause of death and burial place, and other information. All of these are structured to contain the information source, and can often contain different pieces of information from different sources. Information on confiscated possessions and their estimated value sheds light on what kind of valuable personal possessions a person had. Information is also given about the occurrence of a person in Soviet war propaganda magazines or fliers, either in pictures or text.

[3] http://dublincore.org/usage/documents/principles/
[4] http://cidoc-crm.org
[5] https://github.com/SemanticComputing/Warsampo-schema
[6] Source code for data conversion and linking is available online: https://github.com/SemanticComputing/WarPrisoners.
[7] http://dublincore.org/documents/dcmi-terms/

3 Prisoners of War in the WarSampo Portal

A new application perspective was added to the WarSampo portal for studying, exploring and analyzing the prisoners of war dataset as a whole. In addition, the existing WarSampo Persons perspective, which generates a “home page” for each person in the WarSampo knowledge graph, was extended to show possibly contradictory data originating from multiple sources (e.g. death records, prisoner records, Wikipedia). The Prisoners perspective application is open source and available online [8].

3.1 Biographical View in the Persons Perspective

The WarSampo Persons perspective offers a general search of people in the WarSampo knowledge graph. Each person is provided with a biographical view, a home page, that reassembles the biographical knowledge of the person from the WarSampo datasets into a structured format.

Figure 1 shows an example of a soldier’s home page, where the information is combined from a prisoner record and a death record. The left side of the page contains a person selector and a text box for filtering the people by name. The details of a selected person are displayed on the right. Information usually exists from birth to death, with a clear and understandable focus on the war-time events. A property (e.g. occupation) may contain multiple values. In order to make the biographical view as transparent as possible, all values have been supplemented with a reference to the information source. In the figure, source number 2 refers to the POW register. There is a total of 12 sources of information for the particular person, which also includes a death record, and 10 different sources from the POW register.

Figure 1: The Persons perspective showing part of a person’s home page.

The values that have been linked to WarSampo domain ontologies are shown as links to corresponding home pages. The idea here is that the WarSampo semantic portal acts as a customized graphical RDF browser, which makes it possible for the user to find surprising connections between the individual resources of the WarSampo knowledge graph.

3.2 Prosopographical Prisoners Perspective

The Prisoners perspective is based on the previously released Casualties perspective (Koho et al., 2017). The main design principle of these perspectives is to target one core class of the WarSampo knowledge graph (e.g., prisoner record) and provide the user with a faceted search (Tunkelang, 2009; Oren et al., 2006) interface, which initially renders a result set that contains all instances of the target class as a paginated table. This alleviates the “blank search field problem”, where a new user does not know what kind of query terms should be used for meaningful results. The initial result set can be narrowed down by using various facets (e.g., military unit or prison camp).

Figure 2 shows a part of the Prisoners perspective user interface. Facets are presented on the left of the user interface. The number of hits (instances of the target class) produced by each facet value is calculated dynamically and is shown in parentheses. Facet values leading to an empty result set are hidden. To reduce unnecessary data fetching, most of the facets are disabled by default. They can be activated by clicking the plus sign on the facet header. The facets are name, date of being captured as a POW, date of death, military unit, military rank, POW camps where the person has been, occupation, marital status, number of children, birth municipality, place of being captured, and place of death.

Figure 2: Prisoners perspective: facet selection results shown as a table view.

The results are displayed on the right side of the user interface. The result set, based on the facet selections, can be shown as a table, or shown with three different visualizations:

1. a distribution chart over a selected property, with property choices: military rank, military unit, occupation, number of children, birth municipality, municipality of residence, place of being captured, and place of death,
2. an age distribution chart at the time of capturing,
3. a sankey diagram of soldier life paths based on known geographical locations at different times, starting from the municipality of birth and ending at the municipality of death.

The results display mode can be selected using the button in the top bar. In Figure 2, the results are displayed as a table, with each row corresponding to a single prisoner record, with several key properties mapped to separate columns. Figure 3 shows the age distribution of all soldiers whose rank is private at the time when they have been captured as a prisoner of war. Figure 4 shows the military rank distribution of the soldiers that were born in Helsinki.

Figure 3: Prisoners perspective: age distribution of the soldiers with the military rank private.

Figure 4: Prisoners perspective: statistics view

The common usage scenario of the average user is to search for information about their relatives who have participated in the war. This can be achieved most easily with the table view of results and using the different facets, mostly the name facet, where a person can search with just a part of the name to get all the results containing it. Another way to find relatives, who historically are often situated in the same region, is to filter the results with the birth municipality facet.

Another usage scenario is studying and analyzing the data by a historian or an interested citizen. The facets already provide distributions of the facet values, with the number of hits after each value. When a selection is made in one of the facets, all of the facets are updated to show the distribution of values with that selection. Further analysis can be done with the various visualizations of the facet results. New visualizations, e.g. locations of the POW camps on a map, can be added rather easily to the application, and the existing ones extended as needed.

[8] https://github.com/SemanticComputing/prisoners-demo

4 Implementation

The Prisoners perspective is an AngularJS [9] web application, which consists of several modules. The facet functionality is implemented using SPARQL Faceter [10] (Koho et al., 2016), a module that provides

• a set of directives that work as configurable facets,
• a service that synchronizes the facet selections,
• a service for updating the URL parameters based on facet selections, and retrieving the facet values from URL parameters,
• a service for retrieving SPARQL results based on the facet selections, using a configurable query template.

For querying the SPARQL endpoint, mapping the SPARQL results into JavaScript objects and paging the results, we have developed another general module [11] that is being used across the WarSampo semantic portal.

In addition to the default paginated table result view, powered by the ngTable [12] directive, we have implemented several reusable visualization directives for displaying the results on modern or historical maps or as statistical distributions. For the Prisoners perspective, a new sankey visualization directive was built using Google Charts [13].

The Persons perspective is part of the WarSampo portal AngularJS core infrastructure [14]. It was extended to fetch data to the person’s home page from the prisoner records, along with the source reifications. The page was redesigned and restructured to be able to integrate the data from the prisoner records, and to show the prisoner record data along with the information from a person instance and a death record, of which the latter may or may not be present. Showing and numbering the information sources was also a new addition.

[9] https://angularjs.org/
[10] https://github.com/SemanticComputing/angular-semantic-faceted-search
[11] https://github.com/SemanticComputing/angular-paging-sparql-service
[12] https://github.com/esvit/ng-table
[13] https://github.com/angular-google-chart/angular-google-chart
[14] https://github.com/SemanticComputing/warsampo-angular-app

5 Discussion

This paper presented first results of publishing the prisoners of war dataset as part of WarSampo. The POW data contains sensitive information about individual citizens, some of whom are still alive. The publication of the data has been delayed due to the evaluation as to what information can be legally published about the individuals, and what needs to be hidden. The dataset and new portal are expected to be finally published in November 2019.

The combination of faceted search and various result visualization components forms the base of the user interface of the Prisoners perspective. This design has proved to be broadly applicable to many kinds of datasets. By browsing through the facets, the user can quickly see what kind of values have been used for different properties. This often reveals inconsistencies and spelling errors, if the property values have not been systematically entered or harmonized, or shows that they are completely missing for a large number of resources. For estimating the completeness and the reliability of the dataset, looking at the actual property values is often more important than focusing on data modeling details.

Maintaining interlinked datasets and domain ontologies presents new challenges (Auer et al., 2012; Maedche et al., 2003), as changes in one part need to be accounted for in other interlinked parts. The Linked Data environment is not yet mature enough to have easy-to-use tools for non-technical people to use for editing and maintaining interlinked data. Hence, the POW data is still maintained using the spreadsheet with agreed upon formatting and structuring, which can then be re-integrated easily into WarSampo. The Linked Data approach requires tighter co-operation with the domain experts and data publishers, especially in the creation phase of historical information (Boonstra et al., 2004), than more traditional ways of data publishing. However, using Linked Data it is possible to create an understanding about the whole of the war by combining information from several datasets, which would not be easy by studying the individual datasets directly.

The historical occupations in the WarSampo datasets have recently been harmonized into a manually curated SKOS-based [15] ontology AMMO (Koho et al., 2019), to which the prisoner records are linked. The ontology combines synonymous occupational labels into harmonized occupation resources, and provides structures of social stratification and occupational groups. It will enable studying the prisoner records using new facets in the future, such as social class and field of work, and facilitate the use of the dataset to answer new kinds of research questions of collaborating historians.

Integration of videos and other documents relating to the prisoners of war will be implemented later, and will consist of expressing the document metadata in terms of CIDOC CRM, and linking the prisoners to the related document resources, which in turn contain URL links to the document files.

Integrating data into a Linked Data infrastructure is more laborious than simpler ways of publishing the data as an independent data object, which does not communicate with other datasets. However, the result of the integration is an interlinked knowledge base, where the interlinked graphs enrich each other, creating a whole that is greater than the sum of its parts (Hyvönen, 2012).

[15] https://www.w3.org/TR/skos-primer/

Acknowledgements

Reijo Nikkilä, Tiia Moilanen, and Pertti Suominen of The National Prisoners of War Project worked on the data as domain experts. Katri Miettinen indexed related documents for linking with persons.

Our work was funded by the Association for Cherishing the Memory of the Dead of the War [16], Teri-Säätiö, Open Science and Research Initiative [17] of the Finnish Ministry of Education and Culture, the Finnish Cultural Foundation, and the Academy of Finland.

The authors wish to acknowledge CSC – IT Center for Science, Finland, for computational resources.

[16] http://www.sotavainajat.net/in_english
[17] http://openscience.fi/

6 References

Teuvo Alava, Dmitri Frolov, and Reijo Nikkilä. 2003. Rukiver. Suomalaiset sotavangit Neuvostoliitossa. Helsinki: Edita.
Sören Auer, Theodore Dalamagas, Helen Parkinson, Bancilhon, et al. 2012. Diachronic linked data: towards long-term preservation of structured interrelated information. In Proceedings of the First International Workshop on Open Data, pages 31–39. ACM.
Onno Boonstra, Leen Breure, and Peter Doorn. 2004. Past, present and future of historical information science. Historical Social Research, 29(2):4–132.
Antske Fokkens, Serge ter Braake, Niels Ockeloen, Piek Vossen, Susan Legêne, Guus Schreiber, and Victor de Boer. 2017. Biographynet: Extracting relations between people and events. In Europa baut auf Biographien, pages 193–224. New Academic Press, Wien.
Eileen Gardiner and Ronald G. Musto. 2015. The Digital Humanities: A Primer for Students and Scholars. Cambridge University Press, New York, NY, USA.
Forest Gregg and Derek Eder. 2019. Dedupe. https://github.com/dedupeio/dedupe.
Lifang Gu, Rohan Baxter, Deanne Vickers, and Chris Rainsford. 2003. Record linkage: Current practice and future directions. CSIRO Mathematical and Information Sciences Technical Report, 3:83.
Olaf Hartig. 2009. Provenance information in the web of data. In Proceedings of the WWW2009 Workshop on Linked Data on the Web, volume 538 of CEUR Workshop Proceedings.
Tom Heath and Christian Bizer. 2011. Linked Data: Evolving the web into a global data space. Synthesis Lectures on The Semantic Web: Theory and Technology. Morgan & Claypool Publishers, Palo Alto, USA.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen, and Eetu Mäkelä. 2016. WarSampo data service and semantic portal for publishing linked open data about the Second World War history. In The Semantic Web – Latest Advances and New Domains (ESWC 2016), pages 758–773. Springer-Verlag.
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen, and Kirsi Keravuori. 2019. Biographysampo – publishing and enriching biographies on the semantic web for digital humanities research. In Proceedings of the 16th Extended Semantic Web Conference (ESWC 2019), pages 574–589. Springer-Verlag.
Eero Hyvönen. 2012. Publishing and Using Cultural Heritage Linked Data on the Semantic Web. Synthesis Lectures on the Semantic Web: Theory and Technology. Morgan & Claypool Publishers, Palo Alto, USA.
Mikko Koho, Erkki Heino, and Eero Hyvönen. 2016. SPARQL Faceter – Client-side Faceted Search Based on SPARQL. In Joint Proc. of the 4th International Workshop on Linked Media and the 3rd Developers Hackshop, number 1615. CEUR Workshop Proceedings.
Mikko Koho, Eero Hyvönen, Erkki Heino, Jouni Tuominen, Petri Leskinen, and Eetu Mäkelä. 2017. Linked death – representing, publishing, and using Second World War death records as linked open data. In Eva Blomqvist, Katja Hose, Heiko Paulheim, Agnieszka Ławrynowicz, Fabio Ciravegna, and Olaf Hartig, editors, The Semantic Web: ESWC 2017 Satellite Events, pages 369–383. Springer-Verlag.
Mikko Koho, Erkki Heino, Esko Ikkala, Eero Hyvönen, Reijo Nikkilä, Tiia Moilanen, Katri Miettinen, and Pertti Suominen. 2018a. Integrating prisoners of war dataset into the WarSampo linked data infrastructure. In Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018). CEUR Workshop Proceedings, March. Vol 2084.
Mikko Koho, Esko Ikkala, Erkki Heino, and Eero Hyvönen. 2018b. Maintaining a linked data cloud and data service for Second World War history. In Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus, volume 11196. Springer-Verlag, October-November.
Mikko Koho, Lia Gasbarra, Jouni Tuominen, Heikki Rantala, Ilkka Jokipii, and Eero Hyvönen. 2019. AMMO Ontology of Finnish Historical Occupations. In Proceedings of the First International Workshop on Open Data and Ontologies for Cultural Heritage (ODOCH’19), volume 2375. CEUR Workshop Proceedings, June.
Petri Leskinen, Mikko Koho, Erkki Heino, Minna Tamper, Esko Ikkala, Jouni Tuominen, Eetu Mäkelä, and Eero Hyvönen. 2017. Modeling and using an actor ontology of Second World War military units and personnel. In Proceedings of the 16th International Semantic Web Conference (ISWC 2017). Springer-Verlag, October.
Alexander Maedche, Boris Motik, Ljiljana Stojanovic, Rudi Studer, and Raphael Volz. 2003. An infrastructure for searching, reusing and evolving distributed ontologies. In Proc. of the twelfth international conference on World Wide Web, pages 439–448. ACM Press.
Eyal Oren, Renaud Delbru, and Stefan Decker. 2006. Extending faceted navigation for RDF data. In International semantic web conference, pages 559–572. Springer-Verlag.
Daniel Tunkelang. 2009. Faceted search. Synthesis lectures on information concepts, retrieval, and services. Morgan & Claypool Publishers.
Koenraad Verboven, Myriam Carlier, and Jan Dumolyn. 2007. A short manual to the art of prosopography. In Prosopography approaches and applications. A handbook, pages 35–70. Unit for Prosopographical Research (Linacre College).
Christopher Warren, Daniel Shore, Jessica Otis, Lawrence Wang, Mike Finegold, and Cosma Shalizi. 2016. Six degrees of Francis Bacon: A statistical method for reconstructing large historical social networks. Digital Humanities Quarterly, 10(3).
Jun Zhao, Christian Bizer, A Gil, Paolo Missier, and Satya Sahoo. 2010. Provenance requirements for the next version of RDF. In Proceedings of the W3C Workshop – RDF Next Steps. W3C. https://www.w3.org/2009/12/rdf-ws/papers/ws08.
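
The privacy rule described in the Data Conversion section (personal information is removed when no date of death is known, and health related information is removed when the person died less than 50 years ago) can be illustrated with a minimal Python sketch. This is not the code of the actual pipeline. The column names, the grouping of columns into "personal" and "health related", and the function names are hypothetical, since the paper does not list the spreadsheet columns. Hiding health columns for people without a known date of death is also an assumption of this sketch.

```python
from datetime import date
from typing import Any, Dict, Optional

# Hypothetical column groups; the real spreadsheet columns are not listed in the paper.
PERSONAL_COLUMNS = {"given_names", "date_of_birth"}
HEALTH_COLUMNS = {"cause_of_death", "hospitals"}
PROTECTION_YEARS = 50  # threshold stated in the text


def protection_ends(death: date) -> date:
    """Return the day on which PROTECTION_YEARS have passed since the death."""
    try:
        return death.replace(year=death.year + PROTECTION_YEARS)
    except ValueError:
        # 29 February has no counterpart in the target year; use 1 March.
        return date(death.year + PROTECTION_YEARS, 3, 1)


def redact_record(record: Dict[str, Any], today: date) -> Dict[str, Any]:
    """Return a copy of the record without the columns that must not be published."""
    death: Optional[date] = record.get("date_of_death")
    if death is None:
        # No known date of death: the person may still be alive.
        # Assumption of this sketch: health columns are hidden here as well.
        hidden = PERSONAL_COLUMNS | HEALTH_COLUMNS
    elif today < protection_ends(death):
        # Died less than 50 years ago: hide health related columns only.
        hidden = set(HEALTH_COLUMNS)
    else:
        hidden = set()
    return {key: value for key, value in record.items() if key not in hidden}


if __name__ == "__main__":
    # Invented placeholder record, not real data.
    example = {"given_names": "N. N.", "rank": "private", "date_of_death": None}
    print(redact_record(example, date(2019, 11, 1)))
```

Deriving the hidden columns from the date of death alone keeps the rule repeatable, so it can be rerun whenever the spreadsheet is updated.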
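
The provenance approach named in the Interlinking section, RDF reification combined with the DCMI Metadata Terms property source, can be sketched as plain triples. The sketch below builds the triples as Python tuples and uses only the standard RDF reification vocabulary and the dcterms:source property. All http://example.org/ URIs, the occupation property, and the helper name are hypothetical and do not reflect the WarSampo schema.

```python
from typing import List, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT = "http://purl.org/dc/terms/"
EX = "http://example.org/"  # hypothetical namespace for illustration only

Triple = Tuple[str, str, str]


def reify_with_source(statement: str, subject: str, predicate: str,
                      obj: str, source: str) -> List[Triple]:
    """Return the asserted triple plus its reification annotated with a source."""
    return [
        (subject, predicate, obj),                      # the value itself
        (statement, RDF + "type", RDF + "Statement"),   # reification node
        (statement, RDF + "subject", subject),
        (statement, RDF + "predicate", predicate),
        (statement, RDF + "object", obj),
        (statement, DCT + "source", source),            # where the value came from
    ]


if __name__ == "__main__":
    # Two conflicting occupation values for one record, each with its own source.
    triples = (
        reify_with_source(EX + "stmt1", EX + "record1", EX + "occupation",
                          EX + "farmer", EX + "source2")
        + reify_with_source(EX + "stmt2", EX + "record1", EX + "occupation",
                            EX + "worker", EX + "source5")
    )
    for triple in triples:
        print(triple)
```

Because each value gets its own statement node, several values of the same property can coexist, each pointing to a different information source.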