<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Linked Data - A Paradigm Shift for Publishing and Using Biography Collections on the Semantic Web</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eero Hyvönen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petri Leskinen</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Minna Tamper</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heikki Rantala</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Esko Ikkala</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jouni Tuominen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kirsi Keravuori</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Finnish Literature Society</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>HELDIG - Helsinki Centre for Digital Humanities, University of Helsinki</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Semantic Computing Research Group (SeCo), Aalto University</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper argues for making a paradigm shift in publishing and using biographical dictionaries on the Web, based on Linked Data. The idea is to represent biographical data in a harmonized, semantically interoperable form, which enables 1) data enrichment by aggregating linked content from complementary, distributed, and heterogeneous data sources, as well as by reasoning, and 2) development of intelligent services using machine “understandable” data. Based on the aggregated global knowledge graph, published in a SPARQL endpoint, tooling for 1) biographical research of individual persons as well as for 2) prosopographical research on groups of people can be provided. As a demonstration of these ideas, we discuss the new in-use linked data service and semantic portal BIOGRAPHYSAMPO - Finnish Biographies on the Semantic Web that quickly attracted thousands of end users on the Web. This semantic portal is based on a knowledge graph extracted automatically from a collection of 13 100 textual biographies, written by 980 scholars. The texts are enriched with data linking to 16 external data sources and by harvesting external collection data from libraries, museums, and archives. Reasoning is used for query expansion and for discovering serendipitous relations between entities, such as persons and places.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Biographical Dictionaries on the Web</title>
      <p>
        Biographical dictionaries (Keith, 2004) may contain tens of thousands of short biographies of historical persons of importance. Traditionally, such dictionaries have been published as printed book series. The Oxford Dictionary of National Biography<sup>1</sup> (ODNB), with more than 60 000 lives, was first published on-line in 2004, and since then major biographical dictionaries have opened their editions on the Web with search engines for finding and (close) reading biographies of interest. On-line national biographical collections include USA’s American National Biography<sup>2</sup>, Germany’s Neue Deutsche Biographie<sup>3</sup>, Biography Portal of the Netherlands<sup>4</sup>, Dictionary of Swedish National Biography<sup>5</sup>, and National Biography of Finland<sup>6</sup> (NBF). ODNB and other early adopters of web technology started the paradigm shift in publishing and reading biographical dictionaries on the Web. We call such systems second-generation publications. This paper argues for taking the next step forward into third-generation systems, i.e., to publishing and using biographical dictionaries as Linked Data on the Semantic Web. The goal is to serve both machine and human readers, and to support both close and distant reading
        <xref ref-type="bibr" rid="ref46">(Shultz, June 24 2011)</xref>
        . To demonstrate and evaluate this idea in practice, we present the new in-use system BIOGRAPHYSAMPO – Finnish Biographies on the Semantic Web<sup>7</sup>
        <xref ref-type="bibr" rid="ref12 ref53">(Hyvönen et al., 2019)</xref>
        based on the NBF and other biography collections of the Finnish Literature Society<sup>8</sup>. The idea is to 1) transform textual biographies into Linked Data by using language technology and knowledge extraction, to 2) enrich the data by linking it to internal and external data sources and by reasoning, to 3) publish the data as a Linked Data service and a SPARQL endpoint on the Web
        <xref ref-type="bibr" rid="ref18">(Heath and Bizer, 2011; Hyvönen, 2012)</xref>
        , and to 4) create end-user applications on top of the service, including data-analytic tools and visualizations for distant reading of Big Data.
      </p>
      <p><sup>1</sup> https://www.oxforddnb.com</p>
      <p><sup>2</sup> http://www.anb.org/aboutanb.html</p>
      <p><sup>3</sup> http://www.ndb.badw-muenchen.de/ndb_aufgaben_e.htm</p>
      <p><sup>4</sup> http://www.biografischportaal.nl/en</p>
      <p><sup>5</sup> https://sok.riksarkivet.se/Sbl/Start.aspx?lang=en</p>
      <p><sup>6</sup> https://kansallisbiografia.fi/english</p>
      <p>
        This paper considers BIOGRAPHYSAMPO from the perspective of a publishing paradigm shift, complementing our earlier papers:
        <xref ref-type="bibr" rid="ref12 ref53">(Hyvönen et al., 2019)</xref>
        gives an overview of BIOGRAPHYSAMPO from an end-user’s point of view; knowledge extraction from texts is addressed in (Tamper et al., 2018);
        <xref ref-type="bibr" rid="ref53">(Tamper et al., 2019)</xref>
        focuses on network analysis of the biographies; and
        <xref ref-type="bibr" rid="ref12 ref25 ref35 ref53">(Hyvönen and Rantala, 2019)</xref>
        discusses relational search of named entities, yet another application perspective of the portal.
      </p>
      <p>In the following, we first present the underlying “Sampo” publishing model and the series of semantic portals of which BIOGRAPHYSAMPO is the newest member. After this, the underlying knowledge graph is presented, and the new possibilities that linked data offers for biographical and prosopographical research are illustrated. In conclusion, related works are discussed and contributions summarized.</p>
      <p><sup>7</sup> BIOGRAPHYSAMPO is available at http://biografiasampo.fi. More information and publications are available at the project homepage https://seco.cs.aalto.fi/projects/biografiasampo/en/.</p>
      <p><sup>8</sup> https://www.finlit.fi/en</p>
    </sec>
    <sec id="sec-2">
      <title>2. Sampo Model for Linked Data Publishing</title>
      <p>The ideas of the Semantic Web (SW) and Linked Data can be applied to address the problems of (semantic) data interoperability and distributed content creation at the same time, as depicted in Fig. 1. Here the publication system is illustrated by a circle. A shared semantic ontology infrastructure is situated in the middle. It includes mutually aligned metadata and shared domain ontologies, modeled using SW standards. When content providers outside of the circle provide the system with (meta)data, the data from different providers are automatically linked to each other and enriched, forming a knowledge graph. For example, if metadata about a painting created by Picasso comes from an art museum, it can be enriched (linked) with, e.g., biographies from Wikipedia and other sources, photos taken of Picasso or by him, information about his wives, books in a library describing his works of art, related exhibitions open in museums, and so on. At the same time, the contents of any organization in the portal having Picasso-related material get enriched by the metadata of the new artwork entered in the system. This makes joining such a system a win-win “business model” for everybody; collaboration pays off.</p>
      <p>
        However, the model also creates new challenges. In addition to enriching information, conflicting data from different sources may also be aggregated, leading to problems of data fusion. A solution to this is to maintain provenance metadata about the primary sources (cf., e.g.,
        <xref ref-type="bibr" rid="ref31">Koho et al., 2019</xref>
        ). This is also needed in order to promote and separate the identities of the data providers and to acknowledge their distinct contributions in the Sampo. Yet another issue is how to maintain the Sampo when the aggregated data or the ontology infrastructure changes. To make this as automatic as possible, human involvement in the annotation and publishing pipeline should be minimized, as suggested in (Koho et al., 2018). However, taking the human out of the loop may lower the quality of data, and more source criticism and understanding of the limitations of the automatically annotated and aligned data is needed from the end-user
        <xref ref-type="bibr" rid="ref12 ref53">(Hyvönen et al., 2019)</xref>
        . In general, more collaboration and mutual agreements are needed between the publishers, which complicates the publishing process. The underlying technology also requires a new kind of expertise in semantic computing.
      </p>
      <p>
        The model of Fig. 1 fits well with the Linked Data idea of providing data as a service and as a live SPARQL endpoint (Heath and Bizer, 2011), on top of which independent applications can be created on the client side without server-side concerns. We call this whole the Sampo<sup>9</sup> model
        <xref ref-type="bibr" rid="ref18">(Hyvönen, 2012)</xref>
        .
      </p>
      <p>The model has been developed and tested in a series of practical case studies, including CultureSampo<sup>10</sup> (2008) for cross-cultural contents, TravelSampo<sup>11</sup> (2011) for tourism, BookSampo<sup>12</sup> (2011) for fiction literature, WarSampo<sup>13</sup> (2015) for military history, and NameSampo<sup>14</sup> (2019) for toponomastic research of historical place names. Our experiences suggest that the Sampo model is a promising way to create useful systems that end-users like. For example, in 2018, BookSampo had ca. 2 million users and WarSampo 230 000.</p>
      <p><sup>9</sup> In Finnish mythology and the epic Kalevala, “Sampo” is a mythical artefact of indeterminate type that brings its owner riches and good fortune, an ancient metaphor of technology.</p>
      <p><sup>10</sup> https://seco.cs.aalto.fi/applications/kulttuurisampo/</p>
      <p><sup>11</sup> https://seco.cs.aalto.fi/applications/travelsampo/</p>
      <p>
        In BIOGRAPHYSAMPO, the knowledge graph was extracted from the biography collections listed in Table 1. It was linked internally and also enriched with links to the external data sources listed in Table 2. In addition, data was harvested from 1) the art collection data of the National Gallery of Finland<sup>15</sup>, 2) the national bibliography of Finland, Fennica<sup>16</sup>, 3) the BookSampo semantic portal<sup>17</sup> linked data for fiction literature
        <xref ref-type="bibr" rid="ref41">(Mäkelä et al., 2011)</xref>
        , 4) the critical edition of J.V. Snellman’s works
        <xref ref-type="bibr" rid="ref48">(Snellman, 2002–2004)</xref>
        <sup>18</sup>, and 5) the Finnish history ontology HISTO<sup>19</sup>. The core biographies were converted into RDF by using a natural language processing pipeline described in more detail in (Tamper et al., 2018). The data model used is an extension of CIDOC CRM (Doerr, 2003; Le Boeuf et al., 2019) that we call Bio CRM
        <xref ref-type="bibr" rid="ref21 ref59">(Tuominen et al., 2018)</xref>
        . In this model, the life of a person is essentially a chain of events in time and space in which the person participated in different roles. The knowledge graph was published in a Linked Data service<sup>20</sup>, on top of which the semantic portal BIOGRAPHYSAMPO with seven application perspectives was implemented using a standard SPARQL endpoint API.
      </p>
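      <p>The event-based view of a life described above can be illustrated with a minimal Python sketch. It is not the Bio CRM schema itself: the class name LifeEvent, its fields, and the role labels are hypothetical and chosen for illustration only. The sample events use only years and places mentioned in this paper.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LifeEvent:
    """One event in which a person participated (hypothetical structure)."""
    person: str
    kind: str    # e.g. "birth", "career", "death"
    year: int
    place: str   # empty string when the place is not given here
    role: str    # role of the person in the event


def life_chain(events, person):
    """Return the events of one person ordered in time."""
    own = [e for e in events if e.person == person]
    return sorted(own, key=lambda e: e.year)


# Illustrative events only; the labels are assumptions, not portal data.
events = [
    LifeEvent("Gustaf A. Silfverhjelm", "death", 1864, "western Siberia", "deceased"),
    LifeEvent("Gustaf A. Silfverhjelm", "birth", 1799, "", "born"),
    LifeEvent("Eliel Saarinen", "birth", 1873, "", "born"),
    LifeEvent("Eliel Saarinen", "death", 1950, "", "deceased"),
]

chain = life_chain(events, "Gustaf A. Silfverhjelm")
kinds = [e.kind for e in chain]
lifespan = chain[-1].year - chain[0].year
print(kinds, lifespan)
```
      </preformat>
      <p>Representing a life as a list of events keeps persons, places, times, and roles separate, so the same records can feed maps, timelines, and statistics.</p>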
    </sec>
    <sec id="sec-3">
      <title>3. New Ways for Using Biographies</title>
      <sec id="sec-3-1">
        <title>3.1. From Text Publishing to Tooling for DH Research</title>
        <p>
          Data analysis in Digital Humanities (DH) is typically done partly by the machine, partly by the human. In visualizations, such as maps, timelines, and networks, the machine presents target data in a form from which the human user is able to make interpretations more easily. In statistics, e.g., pie charts, line charts, and histograms are used. Another type of tooling is network analysis
          <xref ref-type="bibr" rid="ref55">(Newman, 2018)</xref>
          , where different kinds of connections between entities, such as family relations between persons or references between texts, can be represented as graphs for visual inspection and mathematical analysis. In data analysis and knowledge discovery, statistical or other patterns of data are searched for in order to find “interesting”, serendipitous (Aylett et al., 2012) new knowledge. Techniques such as topic modelling (Brett, 2012) fall into this category. Here, too, the results typically need human interpretation, as statistical methods are usually unable to explain their results. In knowledge-based systems, knowledge structures can be used for this.
        </p>
        <p>Many of the methods and tools above are well-defined and domain independent, and there are many software packages available for using them, such as Gephi<sup>21</sup>, R (Field et al., 2015), and various Python and JavaScript libraries. However, each of them has its own input formats and user interface, and requires specific skills from the user. Furthermore, visualizations are crafted case by case; tools for formulating, adjusting, and comparing analysis results in some general ways would be helpful for the user.</p>
        <p><sup>12</sup> https://seco.cs.aalto.fi/applications/kirjasampo/</p>
        <p><sup>13</sup> https://seco.cs.aalto.fi/projects/sotasampo/en/</p>
        <p><sup>14</sup> https://seco.cs.aalto.fi/projects/nimisampo/</p>
        <p><sup>15</sup> https://www.kansallisgalleria.fi/en/avoin-data/</p>
        <p><sup>16</sup> https://www.kansalliskirjasto.fi/en/news/finnish-national-bibliography-released-as-open-data</p>
        <p><sup>17</sup> http://kirjasampo.fi</p>
        <p><sup>18</sup> http://snellman.kootutteokset.fi</p>
        <p><sup>19</sup> https://seco.cs.aalto.fi/ontologies/histo/</p>
        <p><sup>20</sup> Hosted by the Linked Data Finland service http://ldf.fi.</p>
        <p><sup>21</sup> https://gephi.org</p>
        <p>Second-generation dictionaries of biographies on the Web are used in the following traditional way: a search box or form is filled in, specifying the person(s) whose biographies are searched for. Then the search button is pushed, and a list of hits is shown that can be opened for close reading by clicking. The paradigm change of publishing biographies as linked data (third-generation systems) makes it possible to build systems based on live data services, especially SPARQL endpoints. In this way, other parties can also reuse the data in their own applications. It is possible not only to publish biographies with search interfaces, but also to incorporate ready-to-use tooling for DH research on top of the data service. In addition, the SPARQL endpoint makes it possible to study the data with custom-designed queries in situations where the ready-to-use interfaces are not enough for problem solving. The SPARQL API can also be used for extracting and downloading filtered subsets of data from the endpoint in different formats (e.g., CSV) to be used in external tools, such as spreadsheets, R, or Gephi. Some of these new possibilities are illustrated below by the ready-to-use tools of BIOGRAPHYSAMPO.</p>
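        <p>The CSV export mentioned above can be sketched as follows. The sketch assumes that a query result has already been retrieved in the standard SPARQL 1.1 Query Results JSON format. The variable names (person, birthYear, deathYear) and the sample bindings are hypothetical illustrations, not the actual output of the BIOGRAPHYSAMPO endpoint.</p>
        <preformat>
```python
import csv
import io
import json

# Hypothetical result document in the SPARQL 1.1 Query Results JSON format.
SAMPLE_RESULT = json.loads("""
{
  "head": {"vars": ["person", "birthYear", "deathYear"]},
  "results": {"bindings": [
    {"person": {"type": "literal", "value": "Eliel Saarinen"},
     "birthYear": {"type": "literal", "value": "1873"},
     "deathYear": {"type": "literal", "value": "1950"}},
    {"person": {"type": "literal", "value": "Gustaf A. Silfverhjelm"},
     "birthYear": {"type": "literal", "value": "1799"}}
  ]}
}
""")


def bindings_to_csv(result):
    """Flatten SPARQL JSON bindings into CSV text.

    Variables that are unbound in a row become empty cells.
    """
    columns = result["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in result["results"]["bindings"]:
        writer.writerow([row.get(col, {}).get("value", "") for col in columns])
    return buffer.getvalue()


csv_text = bindings_to_csv(SAMPLE_RESULT)
print(csv_text)
```
        </preformat>
        <p>The resulting text can be saved to a file and opened in a spreadsheet, R, or Gephi; handling of datatypes and language tags is left out of this sketch.</p>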
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Examples: BIOGRAPHYSAMPO at Work</title>
        <p>Problem solving in DH often has two phases, as in the prosopographical research method (Verboven et al., 2007, p. 47): First, a target group of entities that share the characteristics needed for solving the research question at hand is selected from the data (in the case of prosopography, a group of people is selected). Second, the target group is analyzed, and possibly compared with other groups, in order to solve the research question. Using BIOGRAPHYSAMPO follows the same pattern: First, faceted search is used for filtering out a biography or a group of them for prosopography. After this, versatile ready-to-use tooling can be applied for reading a single biography or for analysing groups of biographies and comparing them with each other or other groups.<sup>22</sup></p>
        <p><bold>Enriching the Reading Experience.</bold> After finding a biography of interest, BIOGRAPHYSAMPO provides the user with an enriched reading view of the protagonist’s life by automatically creating a “homepage” for each person, based on 1) data linking and 2) reasoning. Fig. 2 shows as an example the homepage of Eliel Saarinen (1873–1950), a prominent Finnish architect. The page contains six (6) tabs providing different biographical views of the person, here two pages based on the NBF, data at the Linked Data Finland service, a genealogical family tree and homepage by the Geni.com service, and the Finnish Wikipedia article. The entry is linked to seven (7) external data sources on the Web. On the right, recommendation links to related biographies are given, e.g., to similar biographies based on their linguistic content. At the top of the page, there are five (5) tabs providing data-analytic views of Saarinen.</p>
        <p><sup>22</sup> A short video illustrating the ideas of BIOGRAPHYSAMPO is available on the Web: https://vimeo.com/328419960.</p>
        <p><bold>Network Analysis.</bold> For example, Fig. 3 presents Saarinen’s egocentric network based on the links between the biographies in the NBF, with a coloring scheme indicating persons of different types. The depth and other parameters of the network can be controlled by the widgets on the left. In Fig. 4, another tab visualizes the international events of Saarinen’s life on a map and on four timelines of different event types (personal life, career, artistic or scientific creations, and accolades) for a spatiotemporal analysis.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Filtering Groups for Prosopography</title>
        <p>To support prosopography, BIOGRAPHYSAMPO employs faceted search for
filtering out not only individual persons but also groups of
them sharing some properties, such as profession, place
of birth, place of education, working organization, etc.
Once the target group has been selected, various generic
data-analytic tools and visualizations can be applied to the
group: 1) Statistical tools include histograms showing
various numeric value distributions of the biographees, e.g.,
their ages, number of spouses and children, and pie charts
visualizing proportional distributions of professions,
societal domains, and working organizations. 2) Event maps
show how different events (personal life events, career
events, artistic and scientific creation events, and accolades)
participated in by the biographees are distributed on maps.
3) Life charts summarize the lives of persons from a
transitional perspective as blue-red arrows from the birth places
(blue end) to the places of death (red end).</p>
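        <p>The two-phase pattern described above (select a target group with facets, then analyse it) can be sketched as follows. The record fields, the facet names, and the sample records are hypothetical and invented for illustration. The portal itself evaluates facets against the knowledge graph through its SPARQL endpoint, which this sketch does not attempt to reproduce.</p>
        <preformat>
```python
from collections import Counter

# Hypothetical biographee records; names and years are invented placeholders.
PEOPLE = [
    {"name": "A", "profession": "general", "birth": 1790, "death": 1860},
    {"name": "B", "profession": "general", "birth": 1801, "death": 1866},
    {"name": "C", "profession": "minister", "birth": 1810, "death": 1899},
    {"name": "D", "profession": "minister", "birth": 1820, "death": 1881},
]


def select_group(records, **facets):
    """Keep the records that match every selected facet value."""
    return [r for r in records
            if all(r.get(facet) == value for facet, value in facets.items())]


def age_histogram(group, bin_width=10):
    """Count ages at death in bins, keyed by the lower edge of each bin."""
    ages = [r["death"] - r["birth"] for r in group]
    return Counter((age // bin_width) * bin_width for age in ages)


# Two parallel target groups that can be compared side by side.
generals = select_group(PEOPLE, profession="general")
ministers = select_group(PEOPLE, profession="minister")
print(age_histogram(generals))
print(age_histogram(ministers))
```
        </preformat>
        <p>Because the analysis functions take any selected group as input, the same tool can be applied to one group or to two groups for comparison.</p>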
        <p>These tools and visualizations can be applied not only to one target group but also to two parallel groups in order to compare them. For example, Fig. 5 compares the generals and admirals of the Grand Duchy of Finland (1809–1917) (on the left) with the clergy (1800–1920) (on the right). With a few selections from the facets the user can see that, for some reason, quite a few officers moved to the south to die, while the Lutheran ministers stayed more in Finland. The arrows are interactive. For example, by clicking on the peculiar upper arrow to the east, one can find out that this arrow is due to the biography of general Gustaf A. Silfverhjelm (1799–1864), where one can learn that he became a chief cartographer in western Siberia, where he died.</p>
        <p><bold>Searching for Historical Places.</bold> BIOGRAPHYSAMPO also provides the user with a map search view that projects the ca. 100 000 biographical events extracted from the biographies onto the places where they occurred. The maps in this view are not only contemporary ones but also historical maps served by a separate historical ontology and map service, Hipla.fi<sup>23</sup>. Many important events of Finnish history took place in the eastern parts of the country that were annexed to the Soviet Union after the Second World War. Old Finnish places there may have been destroyed, place names may have been changed, and names are now written in Russian. Using semi-transparent historical maps on top of contemporary maps addresses this problem by giving a better historical context for the events.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Relational Knowledge Discovery</title>
        <p>
          To utilize reasoning and knowledge discovery in Linked Data, an application perspective for finding “interesting/serendipitous” (Aylett et al., 2012) connections in the biographical knowledge graph was created. This application idea is related to relational search
          <xref ref-type="bibr" rid="ref21 ref28">(Lohmann et al., 2010; Tartari and Hogan, 2018)</xref>
          . However, in our case a new knowledge-based approach was developed to find out in what ways (groups of) people are related to places and areas. This method effectively rules out nonsensical relations and is able to create natural language explanations for the connections
          <xref ref-type="bibr" rid="ref12 ref25 ref35 ref53">(Hyvönen and Rantala, 2019)</xref>
          . The queries are formulated and the problems are solved using faceted search. For example, the query “How are Finnish artists related to Italy?” is solved by selecting “Italy” from the place facet and “artist” from the profession facet. The results include connections of different types (that could be filtered in another facet), e.g., “Elin Danielson-Gambogi received in 1899 the Florence City Art Award”. The system understands, for example, that Florence is in Italy based on the historical place ontology.
        </p>
        <p><bold>Text Analysis of Biographies.</bold> The biographies can also be analyzed by using linguistic analysis, providing yet another perspective for studying them. Both individual biographies and groups of them can be analyzed and compared with each other, as in prosopography above. For example, it turns out that the biographies of female members of the Finnish Parliament frequently contain the words “family” and “child”, but these words are seldom used in the biographies of their male colleagues. The texts, analyzed by a natural language processing pipeline (Tamper et al., 2018), are stored in a separate knowledge graph of over 100 million triples.</p>
        <p><sup>23</sup> http://hipla.fi</p>
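        <p>The place reasoning behind the relational search example (Florence is in Italy) amounts to following part-of links in a place hierarchy. The following is a minimal sketch under stated assumptions: the PART_OF table, the helper names, and the second connection record are hypothetical, and the actual historical place ontology and the knowledge-based method of the portal are not reproduced here.</p>
        <preformat>
```python
# Hypothetical fragment of a place hierarchy: each place maps to its container.
PART_OF = {
    "Florence": "Tuscany",
    "Tuscany": "Italy",
    "Italy": "Europe",
    "Helsinki": "Finland",
}


def ancestors(place, part_of):
    """Return all places that contain the given place, nearest first."""
    found = []
    current = part_of.get(place)
    # The membership check also stops the loop if the table contains a cycle.
    while current is not None and current not in found:
        found.append(current)
        current = part_of.get(current)
    return found


def located_in(place, region, part_of):
    """True if place equals region or lies inside it according to part_of."""
    return place == region or region in ancestors(place, part_of)


# Connection records as (person, description, place); the second is a placeholder.
CONNECTIONS = [
    ("Elin Danielson-Gambogi", "received the Florence City Art Award in 1899", "Florence"),
    ("Placeholder Person", "placeholder event", "Helsinki"),
]

hits = [c for c in CONNECTIONS if located_in(c[2], "Italy", PART_OF)]
print(hits)
```
        </preformat>
        <p>Selecting “Italy” in a place facet can thus return events recorded at the level of a city, without the user having to list every place inside the country.</p>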
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Related Works and Contributions</title>
      <p>
        Aside from the business of publishing biographical dictionaries in print and on the Web, representing and analyzing biographical data has grown into a new research and application field. In 2015, the first Biographical Data in a Digital World workshop, BD2015, was held, presenting several works on studying and analyzing biographies as data (ter Braake et al., 2015),
        <xref ref-type="bibr" rid="ref9">and the proceedings of BD2017</xref>
        contain more similar works (Fokkens et al., 2017b).
      </p>
      <p>
        BIOGRAPHYSAMPO is a result of research in this area and is related to several other works. In (Larson, 2010), analytic visualizations were created based on a U.S. Legislator registry database. The work on BIOGRAPHYSAMPO is a continuation of two Semantic NBF demonstrators
        <xref ref-type="bibr" rid="ref50">(Hyvönen et al., 2014; Hyvönen et al., 2018)</xref>
        , and the idea has also been applied to a historical registry of students
        <xref ref-type="bibr" rid="ref4">(Hyvönen et al., 2017)</xref>
        and to the U.S. Legislator data (Miyakita et al., 2018). However, BIOGRAPHYSAMPO extends these systems in several new directions in terms of the language technology used and the DH tooling provided, such as network analysis views, relational search, and text analysis views for studying the language of the biographies. More heterogeneous datasets are also now studied and used.
      </p>
      <p>
        Extracting RDF and OWL data from natural language texts has been studied in several works in semantic web research, cf., e.g., (Gangemi et al., 2017). In BiographyNet<sup>24</sup>
        <xref ref-type="bibr" rid="ref61 ref7">(Fokkens et al., 2017a)</xref>
        , language technology was applied to extracting entities and relations in RDF using the biographies of the Biography Portal of the Netherlands as data. This work was related to the larger NewsReader project for extracting structured data from news (Rospocher et al., 2016). The work on BiographyNet focuses more on the challenges of natural language processing and managing the provenance information of data from multiple sources, while the focus of BIOGRAPHYSAMPO is on providing the end-users, both DH researchers and the general public, with intelligent search and browsing facilities, an enriched reading experience, and easy-to-use data-analytic tooling for biography and prosopography. Extracting and studying biographical networks has also been researched in the Six Degrees of Francis Bacon<sup>25</sup> (Warren et al., 2016) project. The statistics views and the idea of analysing the biographies as a collection of texts in BIOGRAPHYSAMPO are related to (Warren, 2018), where the ODNB is analysed as an artifact. In the latter works, Linked Data is not used.
      </p>
      <p>These lines of research are related to ours as they are based on the idea of extracting semantic structures from the largely unstructured biographical text collections, and on using the data for DH research in biography and prosopography. In addition and in contrast to the related works, BIOGRAPHYSAMPO employs the “Sampo model” where the data is enriched through a shared content infrastructure by related external heterogeneous datasets, here, e.g., collection databases of museums, libraries, and archives, a critical edition, genealogical data, and various biographical data sources and semantic portals online. Another difference is that in our work, a main goal has been to develop and provide versatile DH tooling for end-users on top of a Linked Data SPARQL endpoint.</p>
      <p><sup>24</sup> http://www.biographynet.nl</p>
      <p><sup>25</sup> http://www.sixdegreesoffrancisbacon.com</p>
      <p>This paper presented and demonstrated the vision of a paradigm shift in publishing biography collections on the Semantic Web. The vision has also been operationalized and implemented as the semantic portal BIOGRAPHYSAMPO, now in use on the Web by thousands of users. The biographical data of the portal was extracted and aggregated automatically by the computer and has not been fully validated by human experts, which would be impossible due to the amount and complexity of the data. This is a typical situation in DH research, and calls for using more source criticism when interpreting the analyses than when dealing with human-curated datasets. The quality and completeness of the BIOGRAPHYSAMPO data have not yet been analyzed formally, but our informal tests suggest that the results are very useful even if errors are also encountered. This is the price to be paid for advanced end-user services and distant reading on distributed heterogeneous biographical data.</p>
      <p><bold>Acknowledgements.</bold> This research was part of the Severi project<sup>26</sup>, funded mainly by Business Finland. Thanks to CSC – IT Center for Science, Finland, for computational server resources for the data service and applications.</p>
    </sec>
    <sec id="sec-5">
      <title>5. References</title>
      <p>R. S. Aylett, D. S. Bental, R. Stewart, J. Forth, and G. Wiggins. 2012. Supporting serendipitous discovery.</p>
      <p><sup>26</sup> http://seco.cs.aalto.fi/projects/severi</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>ference)</source>
          ,
          <fpage>23</fpage>
          -
          <lpage>25</lpage>
          October,
          <year>2012</year>
          , Aberdeen,
          <string-name>
            <given-names>UK. Megan R.</given-names>
            <surname>Brett</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Topic modeling: A basic introduc-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          tion.
          <source>Journal of Digital Humanities</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ).
          <source>Martin Doerr</source>
          .
          <year>2003</year>
          .
          <article-title>The CIDOC CRM-an ontological</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Magazine</surname>
          </string-name>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ):
          <fpage>75</fpage>
          -
          <lpage>92</lpage>
          . Andy Field, Jeremy Miles, and
          <string-name>
            <given-names>Zoe</given-names>
            <surname>Field</surname>
          </string-name>
          .
          <year>2015</year>
          . Discover-
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>de Boer</surname>
          </string-name>
          . 2017a. BiographyNet: Extracting relations be-
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          phien, pages
          <fpage>193</fpage>
          -
          <lpage>224</lpage>
          . New Academic Press, Wien. Antske Fokkens, Serge ter Braake, Ronald Sluijter,
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Paul</given-names>
            <surname>Arthur</surname>
          </string-name>
          , and
          <string-name>
            <surname>Eveline</surname>
          </string-name>
          Wandl-Vogt, editors.
          <source>2017b.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>BD2017 Biographical Data in a Digital World</source>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <source>CEUR Workshop Proceedings</source>
          , Vol-
          <volume>1399</volume>
          . Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recu-
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>and Misael</given-names>
            <surname>Mongiovì</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Semantic web machine</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <article-title>reading with FRED</article-title>
          .
          <source>Semantic Web Journal</source>
          ,
          <volume>8</volume>
          :
          <fpage>873</fpage>
          -
          <lpage>893</lpage>
          . Tom Heath and
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Linked Data: Evolv-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <article-title>ing the Web into a Global Data Space (1st edition)</article-title>
          . Syn-
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          nology. Morgan &amp; Claypool. Eero Hyvönen and
          <string-name>
            <given-names>Heikki</given-names>
            <surname>Rantala</surname>
          </string-name>
          .
          <year>2019</year>
          . Knowledge-
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>graphs. In DHN2019</surname>
          </string-name>
          , Digital Humanities in the Nordic
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Countries</surname>
          </string-name>
          <year>2019</year>
          .
          <source>CEUR Workshop Proceedings</source>
          , Vol-
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>2364. Eero Hyvönen, Miika Alonen, Esko Ikkala, and Eetu</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          Mäkelä.
          <year>2014</year>
          .
          <article-title>Life stories as event-based linked data:</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>ISWC</surname>
          </string-name>
          <year>2014</year>
          Posters &amp;
          <article-title>Demonstrations Track</article-title>
          . CEUR
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>Workshop Proceedings</source>
          , Vol-
          <volume>1272</volume>
          . Eero Hyvönen.
          <year>2012</year>
          .
          <article-title>Publishing and using cultural her-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          nen, and
          <string-name>
            <given-names>Laura</given-names>
            <surname>Sirola</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Reassembling and en-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Language</surname>
          </string-name>
          ,
          <source>Technology and Knowledge</source>
          , pages
          <fpage>113</fpage>
          -
          <lpage>119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Tuominen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kirsi</given-names>
            <surname>Keravuori</surname>
          </string-name>
          .
          <year>2018</year>
          . Semantic na-
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <article-title>tal Humanities in the Nordic Countries</article-title>
          , 3rd Conference
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <source>(DHN</source>
          <year>2018</year>
          ), pages
          <fpage>372</fpage>
          -
          <lpage>385</lpage>
          . CEUR Workshop Proceed-
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>ings</surname>
          </string-name>
          , Vol-
          <volume>2084</volume>
          . Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>avuori.</surname>
          </string-name>
          <year>2019</year>
          .
          <article-title>BiographySampo - publishing</article-title>
          and enrich-
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <article-title>ities research</article-title>
          .
          <source>In Proceedings of the 16th Extended</source>
          Se-
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <source>mantic Web Conference (ESWC</source>
          <year>2019</year>
          ). Springer-Verlag.
          <source>Thomas Keith</source>
          .
          <year>2004</year>
          .
          <article-title>Changing conceptions of National</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          2018.
          <article-title>Maintaining a linked data cloud and data ser-</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Preservation</surname>
          </string-name>
          , and
          <string-name>
            <surname>Protection</surname>
          </string-name>
          . 7th International Confer-
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          ence,
          <source>EuroMed</source>
          <year>2018</year>
          , Nicosia, Cyprus, volume
          <volume>11196</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <source>Springer-Verlag. Mikko Koho</source>
          , Esko Ikkala, and Eero Hyvönen.
          <year>2019</year>
          . Re-
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <article-title>World War on the semantic web</article-title>
          .
          <source>In BD-2019</source>
          , Biograph-
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <source>ical Data in a Digital World</source>
          <year>2019</year>
          . CEUR Workshop Pro-
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          ceedings, http://ceur-ws.org. Accepted. Ray Larson
          .
          <year>2010</year>
          . Bringing lives to light: Biogra-
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          and Stephen Stead, editors.
          <year>2019</year>
          . Definition of
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>6.2.6. ICOM/CIDOC Documentation Standards Group</mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          cidoc-crm.org/Version/version-6.2.6.
          <string-name>
            <given-names>Steffen</given-names>
            <surname>Lohmann</surname>
          </string-name>
          , Philipp Heim, Timo Stegemann, and
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          <source>Jürgen Ziegler</source>
          .
          <year>2010</year>
          .
          <article-title>The RelFinder user interface</article-title>
          : In-
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          interest.
          <source>In Proceedings of the 14th International Con-</source>
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          <article-title>ference on Intelligent User Interfaces (IUI</article-title>
          <year>2010</year>
          ), pages
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          <fpage>421</fpage>
          -
          <lpage>422</lpage>
          . ACM. Eetu Mäkelä, Kaisa Hypén, and Eero Hyvönen.
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          201, pages
          <fpage>173</fpage>
          -
          <lpage>188</lpage>
          . Springer-Verlag.
          <source>Goki Miyakita</source>
          , Petri Leskinen, and Eero Hyvönen.
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          <article-title>ical research of legislators</article-title>
          . In 7th International Confer-
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          ence,
          <source>EuroMed</source>
          <year>2018</year>
          , Nicosia, Cyprus. Springer-Verlag.
          <source>Mark Newman</source>
          .
          <year>2018</year>
          . Networks. Oxford University Press. Marco Rospocher, Marieke van Erp,
          <string-name>
            <given-names>Piek</given-names>
            <surname>Vossen</surname>
          </string-name>
          , Antske
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Ploeger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Tessel</given-names>
            <surname>Bogaard</surname>
          </string-name>
          .
          <year>2016</year>
          . Building
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          <string-name>
            <surname>Web</surname>
          </string-name>
          ,
          <volume>37</volume>
          :
          <fpage>132</fpage>
          -
          <lpage>151</lpage>
          . Kathryn Schulz. June 24,
          <year>2011</year>
          . What is distant reading?
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          2011/06/26/books/review/the-mechani
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          <source>accessed: 13 August</source>
          <year>2018</year>
          .
          <string-name>
            <given-names>Johan V.</given-names>
            <surname>Snellman</surname>
          </string-name>
          . 2002-2004
          . J. V. Snellman: Kootut
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          <source>teokset 1-24. Ministry of Education and Culture,</source>
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Using biographical texts as linked</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          7th International Conference, EuroMed
          <year>2018</year>
          , Nicosia,
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          <string-name>
            <surname>Cyprus</surname>
          </string-name>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>I</given-names>
          </string-name>
          , pages
          <fpage>125</fpage>
          -
          <lpage>137</lpage>
          . Springer-
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          <string-name>
            <surname>Verlag. Minna Tamper</surname>
          </string-name>
          ,
          <source>Eero Hyvönen, and Petri Leskinen</source>
          .
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          <source>In Proceedings of the 20th International Conference on</source>
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          <source>ing (CICLing</source>
          <year>2019</year>
          ). Springer-Verlag,
          <source>April. Accepted. Gonzalo Tartari and Aidan Hogan</source>
          .
          <year>2018</year>
          . WiSP: Weighted
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          <source>2018. CEUR Workshop Proceedings</source>
          , Vol-
          <volume>2187</volume>
          . Serge ter Braake, Ronald Sluijter, Antske Fokkens, Thierry
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          <string-name>
            <surname>Declerck</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Eveline</given-names>
            <surname>Wandl-Vogt</surname>
          </string-name>
          , editors.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref58">
        <mixed-citation>
          <string-name>
            <surname>BD2015</surname>
          </string-name>
          ,
          <source>Biographical Data in a Digital World</source>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref59">
        <mixed-citation>
          <source>CEUR Workshop Proceedings</source>
          , Vol-
          <volume>1399</volume>
          . Jouni Tuominen, Eero Hyvönen, and
          <string-name>
            <given-names>Petri</given-names>
            <surname>Leskinen</surname>
          </string-name>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref60">
        <mixed-citation>
          <article-title>data for prosopographical research</article-title>
          .
          <source>In BD-2017</source>
          , Bio-
        </mixed-citation>
      </ref>
      <ref id="ref61">
        <mixed-citation>
          <source>graphical Data in a Digital World</source>
          <year>2017</year>
          , pages
          <fpage>59</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref62">
        <mixed-citation>
          <source>CEUR Workshop Proceedings</source>
          , Vol-
          <volume>2119</volume>
          . Koenraad Verboven, Myriam Carlier, and Jan Dumolyn.
        </mixed-citation>
      </ref>
      <ref id="ref63">
        <mixed-citation>
          2007.
          <article-title>A short manual to the art of prosopography</article-title>
          . In
        </mixed-citation>
      </ref>
      <ref id="ref64">
        <mixed-citation>
          book, pages
          <fpage>35</fpage>
          -
          <lpage>70</lpage>
          . Unit for Prosopographical Research
        </mixed-citation>
      </ref>
      <ref id="ref65">
        <mixed-citation>
          2016.
          <article-title>Six Degrees of Francis Bacon: A Statistical</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref66">
        <mixed-citation>
          works.
          <source>DHQ: Digital Humanities Quarterly</source>
          ,
          <volume>10</volume>
          (
          <issue>3</issue>
          ). Christopher
          <string-name>
            <given-names>N.</given-names>
            <surname>Warren</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Historiography's two</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref67">
        <mixed-citation>
          <article-title>(ODNB)</article-title>
          .
          <source>Journal of Cultural Analytics, November</source>
          <volume>22</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref68">
        <mixed-citation>
          doi:10.31235/osf.io/rbkdh.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>