<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Considerations about Uniqueness and Unalterability for the Encoding of Biographical Data in Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Thierry Declerck</string-name>
          <email>declerck@dfki.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rachele Sprugnoli</string-name>
          <email>sprugnoli@fbk.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DFKI GmbH, Multilingual Technologies Lab Stuhlsatzenhausweg 3</institution>
          ,
          <addr-line>D-66123 Saarbrücken</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Fondazione Bruno Kessler, Digital Humanities Group Via Sommarive</institution>
          ,
          <addr-line>18, I-38123 Povo</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>76</fpage>
      <lpage>82</lpage>
      <abstract>
        <p>This paper results from observations made while studying ontological and linked data-based approaches to the encoding of biographical data. Based on certain issues we discovered, which are described here, we call for collaborative work towards guidelines for modelling biographical data in the standard Semantic Web representation languages. The need for guidelines became even clearer after reading an article that describes various types of errors in biographical data encoding, generated by an unsuitable use of the owl:sameAs property when referring to the linked data-based description of the lives of two literary authors. In this context, there is also a need to agree on the core element that constitutes a biographical description. More specifically, we aim to determine the “biographical unit”, which should be modelled first and to which all related information should be linked using corresponding semantic properties. We also discuss the need to define and use synchronic versus diachronic properties associated with the modelled biographical unit. On this point, we come to the conclusion that, for the description of a biographical unit, there are probably no properties whose values remain unaltered over time. This is particularly true if provenance information is taken into account, since it can provide contrasting values that may each be correct from a different point of view.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontologies</kwd>
        <kwd>biographical units</kwd>
        <kwd>linked data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The issue of encoding changes occurring in a life has
received some attention in the context of the formal
representation of biographical data. In one study on this topic,
Krieger and Declerck (2015) present considerations on
synchronicity and diachronicity and how those aspects can be
applied for defining properties in a formal ontology about
biographical data. Building further on this study, one can
eventually come to the conclusion that it is very difficult, if
not impossible, to come up with unalterable property values
that can be associated with an individual within a
biographical description. This leads to the question of whether there
are any properties of human beings which are in fact
immutable and which can therefore serve as the fixed pillar
on the basis of which we can describe all other changeable aspects
and characteristics of human beings.</p>
      <p>At the current state of our study, this does not seem to be the case.
Let us take as an example the case of the soldier Manning,
introduced in Wikipedia as “Chelsea Elizabeth Manning
(born Bradley Edward Manning, December 17, 1987)”.1 In
this case, the question arises whether this person remains
the same after the change of sex, while we want to maintain
the assumption that only one entry for this biographical unit
should be kept.</p>
      <p>We can also be confronted with uncertainty when looking at
the birth date of a person, as this information can
still be modified or corrected in the light of new data,
and depending on the sources consulted. In addition,
it is sometimes not even possible to state which source is
the more reliable one, so that we have to encode
biographical information together with its provenance, especially in
cases where we do not have a unique value. In the end, we
only have the certainty that, biologically speaking, a person
was born only once, but that various birth dates can be
associated with this event, depending on the perspective and
provenance of the information.2</p>
      <p>Our intuition is that a very carefully designed ontology can
offer support when dealing with a “biographical unit”. This
biographical unit might have no fixed characteristics, or
properties, but on the basis of the large set of possibly
divergent values of descriptors (classes and properties) and their
organisation in one ontological space, it can be considered
as one unique carrier of a life. This carrier should then be
uniquely identified by a URI.3</p>
      <p>
        In this paper, we thus concentrate on biographical data as
giving an account of a person’s life and achievements, not
considering at this stage prosopography or what is
sometimes referred to as “collective biography”
        <xref ref-type="bibr" rid="ref5">(Davies and Gannon, 2006)</xref>
        .
      </p>
      <p>In the next sections we will first report on existing
ontological modelling initiatives for biographical data,
before briefly describing the Linked Open Data cloud and
presenting the way biographical data is represented in this
framework. This will be followed by a discussion of the
paper by Brown and Simpson (2013), which describes
how erroneous biographical data can be generated in the
Linked Data framework due to the inappropriate use of the
owl:sameAs property. Finally, we will present our ideas
on how to overcome those issues, also calling for
collaborative work in order to produce guidelines for describing
biographical data within the Linked Open Data cloud.</p>
      <p>1 https://en.wikipedia.org/wiki/Chelsea_Manning.</p>
      <p>2 The same remark certainly applies to the death date of a
person, also biologically speaking, with the only difference that we
can have biographies of living people, where a death date does not
need to be specified until the passing away of the person is
described in a biographical data set.</p>
      <p>3 This could also be an IRI (Internationalized Resource
Identifier), but we use the term URI here to stress the uniqueness of the resource
identifier.</p>
    </sec>
    <sec id="sec-3">
      <title>2 Overview of Existing Models for Biographical Data</title>
      <p>Several repositories of biographies are available in digital
format and specific schemata have been proposed to model
life events to improve the analysis and understanding of
these repositories.</p>
      <p>
        BIO4 describes a person’s life seen as a series of
interlinked events. Its vocabulary, expressed in OWL, has
four core classes: Person, Event, Relationship
and Interval. As for the Event class, BIO
proposes a framework of 37 event types: some of these
types apply to all people (e.g., Birth, Death), others
are more specific (e.g., Coronation, BarMitzvah).
Each event is characterised by four properties: Date,
Place, State (i.e., territory involved in an event), and
Position (i.e., employment position or public office).
Other properties are used to relate an event to an agent
(e.g., Employer, Officiator) or to temporally order
an event with respect to another event (e.g. Following
Event, Preceding Event). An extension of BIO has
been proposed within the Shoah Ontology, a domain
ontology that formally describes concepts and relationships
characterizing the life and persecution of Jews in Italy
between 1943 and 1945
        <xref ref-type="bibr" rid="ref1 ref11 ref8">(Brazzo and Mazzini, 2015)</xref>
        . Here,
the ontology class called Persecution is used to
represent all main events related to the persecution of the
victims (arrest, detention, deportation to a Nazi camp,
transfer to another camp, liberation, death in a massacre). This
class is connected to the Person class, which is based on BIO and
extended with additional civil-registry and genealogical properties
(e.g. niece_nephewOf).
      </p>
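      <p>As a minimal sketch of the event-centred view described above, a birth event could be attached to a person roughly as follows. The bio: namespace and the terms bio:Birth, bio:event, bio:principal, bio:date and bio:place are assumed here to match the published BIO vocabulary and should be checked against its documentation; the ex: namespace and its resources are purely hypothetical. The date and place are those of the Proctor example quoted later in this paper.</p>
      <preformat>
@prefix bio: &lt;http://purl.org/vocab/bio/0.1/&gt; .
@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .
@prefix ex:  &lt;http://example.org/bio-sketch#&gt; .

# Hypothetical person whose life is described as a series of events.
ex:person_1
    bio:event ex:birth_1 .

# The birth is a resource of its own, carrying its date and place.
ex:birth_1
    a bio:Birth ;
    bio:principal ex:person_1 ;
    bio:date  "1868-05-13"^^xsd:date ;
    bio:place "Budleigh Salterton, Devon" .
</preformat>
      <p>Because the event is a separate resource, further statements about it (for instance about its source) can be added without touching the description of the person.</p>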
      <p>
        The aim of the Biography Light Ontology is
twofold
        <xref ref-type="bibr" rid="ref14">(Ramos, 2009)</xref>
        : i) to encode life events following the
4W model, thus answering questions about what, where,
when and who; ii) to improve the interoperability among
existing vocabularies such as LODE (Linking Open
Descriptions of Events)5 and BIO. Biography Light
introduces the main class BioEvent with four subclasses that
represent changes in the health of the biography’s
subject, his/her relations with other people, changes in
location such as migrations, and inventions or discoveries
made by the subject. Event properties are borrowed from
LODE (e.g., atPlace) and from the Event Ontology (e.g.,
isAgentIn). This ontology has been developed within
and adopted by the Bringing Lives to Light: Biography in
Context Project, an initiative of the Electronic Cultural
Atlas Initiative (ECAI)6 at the University of California. The
goal of the project was to design, develop and evaluate tools
that can improve the understanding of biographical texts by
connecting life events to contextual information, including
their location, time of occurrence and related archival
materials
        <xref ref-type="bibr" rid="ref3">(Buckland and Ramos, 2010)</xref>
        . Different datasets and
sources of information were taken into account during the
project: namely, the digital texts provided in the on-line
Biographical Directory of the United States Congress,7 the
manually compiled chronology of Emma Goldman’s
lecture itinerary,8 and the scanned page images of Irish texts.9
      </p>
      <p>
        Bio CRM is a domain-specific extension of CIDOC CRM
        <xref ref-type="bibr" rid="ref6">(Doerr, 2003)</xref>
        : it provides a general model for
representing biographical datasets that can be extended to meet the
requirements of specific projects
        <xref ref-type="bibr" rid="ref16">(Tuominen, 2016)</xref>
        . This
ontology makes a clear distinction between unary roles of
actors, binary relations between actors, and events in which
actors participate in different roles. Events are
described in terms of time, location, participants and other
resources involved; moreover, they are organised in a
hierarchy distinguishing, for example, ecclesiastical from
educational events. Each event type has a corresponding class
of permitted roles. Bio CRM has been developed by the
Semantic Computing Research Group of Aalto University
(Finland) within a set of experiments and projects focused
on the linking, enrichment and visualisation of biographies,
with the aim of improving the reading experience
by providing users with a rich reading
context. A first experiment, called National Semantic
Biography of Finland, takes the short biographies published in the
Finnish National Biography10 as input data and works on
a single type of event (i.e., achievements in the career of a
person). An event extractor is used to identify snippets of
text containing words which express creation events, dates
written in numbers, named entities of type location and a
reference to the name of the subject person of the
biography. Extracted information is then transformed into RDF
following the Bio CRM model and linked to several external
resources such as GeoNames and Wikipedia (Hyvönen et
al., 2014). This approach has also been applied to the
digitised historical register of the Finnish high school “Norssi”,
which includes information about the student lives of more
than 10,000 alumni (Hyvönen et al., 2017).
      </p>
      <p>4 http://vocab.org/bio/</p>
      <p>5 http://linkedevents.org/ontology/</p>
      <p>6 http://ecai.org/.</p>
      <p>
        Biography.owl is a lightweight ontology designed to
represent biographical facts
        <xref ref-type="bibr" rid="ref1 ref11 ref8">(Krieger and Declerck, 2015)</xref>
        :
its main feature is the tripartite structure with which entities
are modelled. More specifically, the most general
class Entity has three subclasses, namely Abstract
(describing concepts and roles), Object (describing physical
things) and Happening. The latter includes both
situations and events, the former being static and atomic, the
latter dynamic and decomposable. Happenings have
properties related to their starting and ending date, the agents
involved in them, and their location. Particular attention is
devoted to the pre- and post-conditions of a happening, by means of
properties encoding causes and effects.
      </p>
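      <p>The tripartite structure just described might be sketched in Turtle as below. Only the class names Entity, Abstract, Object and Happening come from the description above; the bio: prefix binding, the subclass names bio:Situation and bio:Event, and the two property names are assumptions made for illustration and may differ from the actual ontology file.</p>
      <preformat>
@prefix bio:  &lt;http://www.dfki.de/lt/onto/biography.owl#&gt; .   # assumed binding
@prefix owl:  &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .

bio:Entity    a owl:Class .
bio:Abstract  a owl:Class ; rdfs:subClassOf bio:Entity .      # concepts and roles
bio:Object    a owl:Class ; rdfs:subClassOf bio:Entity .      # physical things
bio:Happening a owl:Class ; rdfs:subClassOf bio:Entity .      # situations and events

# Assumed names for the two kinds of happening.
bio:Situation a owl:Class ; rdfs:subClassOf bio:Happening .   # static, atomic
bio:Event     a owl:Class ; rdfs:subClassOf bio:Happening .   # dynamic, decomposable

# Hypothetical properties for the start and end of a happening.
bio:startsAt a owl:DatatypeProperty ; rdfs:domain bio:Happening ; rdfs:range xsd:date .
bio:endsAt   a owl:DatatypeProperty ; rdfs:domain bio:Happening ; rdfs:range xsd:date .
</preformat>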
      <p>7 http://bioguide.congress.gov/</p>
      <p>8 See http://metadata.berkeley.edu/emma/ for a prototype.</p>
      <p>9 For more details see http://ecai.org/neh2007/.</p>
      <p>10 https://kansallisbiografia.fi/english.</p>
    </sec>
    <sec id="sec-4">
      <title>3 Discussing Existing Models</title>
      <p>In this Section we discuss in more detail the approaches
towards synchronic versus diachronic properties proposed
by Krieger and Declerck (2015) and by the Biography Light
Ontology, both briefly introduced in the preceding Section.
Krieger and Declerck (2015) study how to classify relations
(or properties) associated with classes of an ontology as
being either synchronic or diachronic. The assumption behind
this approach is that a date of birth is something that will
not change over time (“a person is born only once”), while
the profession exercised by a person can vary over time.
While this study was mainly concerned with formalisation
aspects, one of the results was that it is in fact very
difficult, if not impossible, to identify a property whose
value is unalterable. We can assume that, biologically
speaking, a person has indeed only one date of birth, but
the statements about this event can be multiple, depending
on the sources, or may be revised over time.</p>
      <p>
        Furthermore, interesting statements about “changes” in the
Biography Light Ontology could be found: “The
Biography Light Ontology takes an event centric approach to the
encoding of biographic texts. It is a lightweight framework
for common biographic occurrences, such as changes in
the health of a biographic subject, relationships between
the subject and other people, social groups, or institutions,
migration or the change of location of a subject, and
biographic events pertaining to creations, inventions, or
discoveries produced by the primary subject. The
Biography Light model introduces the event type bl:BioEvent
with four basic subclasses: bl:ChangeOfHealth,
bl:ChangeOfRelation, bl:ChangeOfLocation,
and bl:Origination”
        <xref ref-type="bibr" rid="ref14">(Ramos, 2009)</xref>
        . In other words,
some classes of the Biography Light Ontology carry
“Change” in their name. Using this ontology, we can extract event
factoids modelled as instances of biographical event classes
from a biography, as in the example below, adapted from
        <xref ref-type="bibr" rid="ref14">(Ramos, 2009)</xref>
        :
      </p>
      <p>Text: Robert George Collier Proctor
(1868–1903), bibliographer, was born in Budleigh
Salterton, Devon, on 13 May 1868. He was
educated at a preparatory school in Reading and at
Marlborough College, before joining Bath
College in 1881.</p>
      <p>Event factoids:</p>
      <sec id="sec-4-1">
        <title>ChangeOfHealth:</title>
        <p>– birth, 1868-05-13, Budleigh Salterton, Devon</p>
      </sec>
      <sec id="sec-4-2">
        <title>ChangeOfSocialRelation:</title>
        <p>– studied at Marlborough College, before 1881</p>
        <p>– studied at Bath College in 1881</p>
        <p>
          We can, however, assume that not only the properties
identified in the Biography Light Ontology can change, but that all
properties associated with a biographical entity are subject to
change. The proposal would thus be to equip all
properties with a time stamp (an instant or a duration).
Technically, this can be done either by allowing n-ary
properties
          <xref ref-type="bibr" rid="ref12">(Krieger, 2014)</xref>
          or by “reifying” a statement about a
biographical unit. For example, the value of hasHealthStatus can
change very often over time: we can thus “reify” the statement
about the health status and encode it as a statement
equipped with a time stamp in the object part of the
resulting new triple. Wikidata11 uses this method for marking
the change of sex/gender by Bradley (later Chelsea)
Manning,12 but also, for example, for marking the number of
inhabitants of a city (Hernández et al., 2015). In addition,
the provenance of such information needs to be taken
into account and encoded properly, so that the user
can select between sources that seem to be more interesting
or more reliable. At this point, we can take advantage of
the work described by Ockeloen et al. (2013).13
        </p>
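        <p>A minimal sketch of such a reified, time-stamped statement with provenance could look as follows in Turtle. It uses standard RDF reification, which is only one possible reading of the proposal above and not necessarily the encoding intended by the authors or used by Wikidata. The rdf:Statement vocabulary and prov:wasDerivedFrom are standard RDF and PROV-O terms; the ex: namespace, the resource names, the dates and the properties ex:hasHealthStatus, ex:validFrom and ex:validUntil are hypothetical and only serve the illustration.</p>
        <preformat>
@prefix rdf:  &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .
@prefix prov: &lt;http://www.w3.org/ns/prov#&gt; .
@prefix ex:   &lt;http://example.org/bio-sketch#&gt; .

# The statement "person_1 has health status Ill" becomes a resource of its own ...
ex:statement_1
    a rdf:Statement ;
    rdf:subject   ex:person_1 ;
    rdf:predicate ex:hasHealthStatus ;
    rdf:object    ex:Ill ;
    # ... so that a validity interval and a source can be attached to it.
    # The dates are illustrative values, not taken from any data set.
    ex:validFrom  "1900-01-01"^^xsd:date ;
    ex:validUntil "1900-03-31"^^xsd:date ;
    prov:wasDerivedFrom ex:source_A .
</preformat>
        <p>A second statement about the same subject and property, with a different value, interval or source, can then coexist with the first one without contradiction.</p>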
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 The Linked Open Data Cloud</title>
      <p>One of our goals in porting a model for biographical data
into a Semantic Web compliant formal representation is to
be able to publish those data as a specialised subset of the
Linked Open Data (LOD) cloud. Figure 1 shows the shape
of this cloud, as of 2018-04-30.14
Looking at this cloud in more detail, the reader can see the
legend to the various colours used to mark the specialised
subsets of the Linked Data infrastructure: Cross Domain,
Geography, Government, Life Sciences, Linguistics, etc.
Biographical data is also present in the LOD cloud, but not (yet) in a
specialised subspace. For example, a large amount of
biographical data is encoded in the DBpedia node, which is
classified as “Cross_domain”.</p>
      <p>DBpedia15 started as an effort to extract
structured data from Wikipedia (mainly its “infoboxes”) and to
encode this information in a Semantic Web compliant
representation language. Nowadays, DBpedia is among the
largest nodes in the LOD cloud. DBpedia organises its data
on the basis of an ontology that was first developed starting
from the Wikipedia category system, which can be found in
the infoboxes, and which evolved into a full ontology
representing a directed acyclic graph. This ontology contains
4,233,000 instances (resources), among which 1,450,000
are about entities classified as “Person” and many about
other topics that are inherently related to the description of
a person, like places, organisations, work, etc.16 The full
ontology is browsable17 and demonstrates that DBpedia
makes use of a large set of (ontological) properties that
can be used to describe a biography. Looking at the full
ontology, one can also see the details of the information
associated with a certain class, illustrating that, for
example, 257 properties are introduced for the class Person.18
The information about domain and range of such
properties is given and we can also recognise the type of each
property, being either a data-type or an object-type
property. Looking at the data-type property birthDate,19 we
can see that the class “Person” is defined as its domain and
the xsd:date type as its range. This setting corresponds
to our intuition that only a Person can have a birth date, but
not, for example, a Group or even an Agent (a superclass
of Person in the DBpedia ontology). However, it is
important to note that the correct setting of the domain and range
of properties only ensures that this information is
inherited by the sub-classes of the class bearing the
properties; it is not a restriction on the instances of the classes
that can be checked in order to avoid inconsistencies.</p>
      <p>11 https://www.wikidata.org/</p>
      <p>12 At https://www.wikidata.org/wiki/Q298423
the entity “Q298423” is marked as being “male” until 22 August
2013 and “transgender female” starting from 22 August 2013.
The change of given name of the “Q298423” entity is marked in
a similar way.</p>
      <p>13 “Provenance” is also a W3C recommendation, see
https://www.w3.org/TR/2013/REC-prov-dm-20130430/.</p>
      <p>14 http://lod-cloud.net/.</p>
      <p>15 See http://wiki.dbpedia.org/ for more details.</p>
      <p>16 For more details see http://wiki.dbpedia.org/services-resources/ontology.</p>
      <p>17 http://mappings.dbpedia.org/server/ontology/classes/.</p>
      <p>18 http://mappings.dbpedia.org/server/ontology/classes/Person.</p>
      <p>19 http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate.</p>
      <p>DBpedia links its data to other knowledge sources, using
OWL constructs such as owl:sameAs to this end. This
construct can cause problems and generate errors. An example
of this issue is given by Brown and Simpson (2013), which
will be described briefly in the following section.</p>
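      <p>The earlier remark that domain and range settings are not checkable restrictions can be made concrete with a small Turtle sketch. The dbo: prefix and the terms dbo:birthDate and dbo:Person follow the DBpedia ontology as described in this section; the ex: resource and the date value are hypothetical. Under RDFS semantics, the domain declaration does not reject the second statement but licenses the inference shown in the final comment.</p>
      <preformat>
@prefix dbo:  &lt;http://dbpedia.org/ontology/&gt; .
@prefix owl:  &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .
@prefix ex:   &lt;http://example.org/bio-sketch#&gt; .

dbo:birthDate
    a owl:DatatypeProperty ;
    rdfs:domain dbo:Person ;
    rdfs:range  xsd:date .

# A resource that stands for a pseudonym, not for a human being (hypothetical).
ex:SomePseudonym
    dbo:birthDate "1846-10-27"^^xsd:date .

# An RDFS reasoner does not report an error here. It concludes instead:
#   ex:SomePseudonym rdf:type dbo:Person .
</preformat>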
    </sec>
    <sec id="sec-6">
      <title>5 Issues with “Michael Field”</title>
      <p>Brown and Simpson (2013) describe problems with
bibliographical entries in the Linked Data context. More
concretely, the case involves Katharine Harris Bradley and her
niece Edith Emma Cooper. Both were authors of poetry and
verse drama and formed a duo, for which they used the
pseudonym “Michael Field”. The use of pseudonyms is
not uncommon and can have many reasons: in this specific case the
choice of a pseudonym could have been motivated by the
fact that the authors had an intimate relationship.
“Hiding” themselves behind a pseudonym with a masculine
name might have been a strategy to avoid social
disapproval. In some knowledge sources the relation between
each of the literary authors and the pseudonym “Michael
Field” is stated in such a way that the pseudonym
inherits a birth/death date, and in the end even two birth/death
dates.20 When the relation to the pseudonym is defined as a
symmetric one, this also means that each author is
associated with two birth/death dates. In
this case the unreflective use of the owl:sameAs
property between one person and the pseudonym is enough to
generate the wrong data and to associate the properties
birthDate and deathDate with the pseudonym.
Nevertheless, defining a restriction in the ontology, stating that
only instances of the class Person can bear the properties
birthDate and deathDate, would suffice for avoiding
this kind of problem.</p>
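      <p>How a symmetric identity link produces the doubled dates can be sketched as follows. owl:sameAs is the standard OWL property; the ex: namespace, the resource names and the property ex:birthDate are hypothetical, while the two birth dates are those given for the two authors in Listings 4 and 5 of this paper.</p>
      <preformat>
@prefix owl: &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt; .
@prefix ex:  &lt;http://example.org/bio-sketch#&gt; .

ex:KatharineBradley ex:birthDate "1846-10-27"^^xsd:date .
ex:EdithCooper      ex:birthDate "1862-01-12"^^xsd:date .

# Each author is declared identical to the pseudonym.
ex:KatharineBradley owl:sameAs ex:MichaelField .
ex:EdithCooper      owl:sameAs ex:MichaelField .

# owl:sameAs is symmetric and transitive, so an OWL reasoner may derive:
#   ex:MichaelField     ex:birthDate "1846-10-27"^^xsd:date , "1862-01-12"^^xsd:date .
#   ex:KatharineBradley owl:sameAs   ex:EdithCooper .
#   ex:KatharineBradley ex:birthDate "1862-01-12"^^xsd:date .
#   ex:EdithCooper      ex:birthDate "1846-10-27"^^xsd:date .
</preformat>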
      <p>
        Some data sets in the LOD, such as DBpedia, introduce
“Michael Field” as an author.21 All the explanation texts in
the DBpedia page for “Michael Field” specify that the entry
is about a pseudonym but at the formal ontological level it
is introduced as a Person, which is wrong, as it should be
an instance of a class Pseudonym. The same error can be
observed in the Yago data set
        <xref ref-type="bibr" rid="ref15">(Suchanek et al., 2007)</xref>
        . In the
Yago data, we can even see that the name of the pseudonym
is segmented in a Given Name and a Family Name and that
the pseudonym is bearing a gender property, with value
“female”.22 We do think that this kind of information is not
appropriate for a pseudonym.
      </p>
      <p>The modelling in Wikidata seems to be more accurate,
as it introduces “Edith Emma Cooper” as an instance of
the class human and establishes a part_of relation to
the “Michael Field” instance of the class collective
pseudonym.23
</p>
      <p>We also noticed that DBpedia makes use of its property
dbo:wikiPageRedirects in order to get to the page
http://dbpedia.org/page/Michael_Field_(author) when querying for “Edith_Emma_Cooper”.
While the property dbo:wikiPageRedirects is
an extremely useful feature, helping to normalize
variants in names and pointing them to the right
DBpedia page, it is rather unfortunate in the case of
“Edith_Emma_Cooper”, as it would be better to land on
a page describing her and not on a page that deals with
the pseudonym she shares with another author.</p>
    </sec>
    <sec id="sec-7">
      <title>6 On the “Biographical Unit”</title>
      <p>As stated above, we did not find a property that can be
considered as having a stable value in order to characterise a
core element of an entry in a biographical dataset. For now,
our intuition is that we just have to declare a class Person,
being a “life carrier” and having a temporal span, to which
all kinds of relevant biographical properties can be assigned.
Instances of this class are uniquely addressed by a URI. This
results in a highly abstract model.</p>
      <p>
        The core element of a biographical unit in such an
ontology being a URI, we strongly discourage the use of the
owl:sameAs property for linking to this unit. The very
negative results of applying such a property to the
description of entries in a biography in a linked data environment
have been precisely and accurately documented in Brown
and Simpson (2013), as we reported in Section 5.
      </p>
      <p>Modelling the “Michael Field” data</p>
      <p>
        We started our modelling experiment by encoding the
biographical data described in
        <xref ref-type="bibr" rid="ref2">(Brown and Simpson, 2013)</xref>
        in order to investigate how we could
avoid the problems described in that paper. In particular,
we developed OWL/RDF(S) code taking as a starting point
the Biographical Ontology
        <xref ref-type="bibr" rid="ref1 ref11 ref8">(Krieger and Declerck, 2015)</xref>
        .
Figure 2 depicts the basic class hierarchy we are using for
modelling the “Michael Field” data also used by Brown
and Simpson (2013). In this figure, the small number of
instances we have included, which are basically the people named in
        <xref ref-type="bibr" rid="ref2">(Brown and Simpson, 2013)</xref>
        , are indicated in parentheses.
      </p>
      <p>The code in Listing 1 displays the way we apply a
restriction to the class bio:Person, where we state that
at least one date of birth has to be given, while we also state
that the property bio:dateOfDeath is defined for this
class.24 The associated properties bio:dateOfBirth
and bio:dateOfDeath not listed here are defined for
domain bio:Person and range xsd:date, similar to
the related properties in DBpedia or Wikidata.</p>
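      <p>The two property declarations that are not listed could, under the description just given, take roughly the following form. This is a sketch only: the bio: prefix binding is assumed, and typing the properties as owl:DatatypeProperty is a plausible reading of the range xsd:date rather than something stated in the text.</p>
      <preformat>
@prefix bio:  &lt;http://www.dfki.de/lt/onto/biography.owl#&gt; .   # assumed binding
@prefix owl:  &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
@prefix xsd:  &lt;http://www.w3.org/2001/XMLSchema#&gt; .

bio:dateOfBirth
    a owl:DatatypeProperty ;
    rdfs:domain bio:Person ;
    rdfs:range  xsd:date .

bio:dateOfDeath
    a owl:DatatypeProperty ;
    rdfs:domain bio:Person ;
    rdfs:range  xsd:date .
</preformat>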
      <p>24 The definition of this class will certainly be updated to include
information about provenance. We will also add a constraint
stating that within a time period, to be counted from the birth date, a
death date has to be given.</p>
      <p>Listing 1: The class bio:Person</p>
      <preformat>
bio:Person
    rdf:type owl:Class ;
    rdfs:subClassOf bio:Agent ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        # cardinality values are non-negative integers, not dates
        owl:minCardinality "1"^^xsd:nonNegativeInteger ;
        owl:onProperty &lt;http://www.dfki.de/lt/onto/biography.owl#bio:dateOfBirth&gt; ;
    ] ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        owl:onProperty &lt;http://www.dfki.de/lt/onto/biography.owl#bio:dateOfDeath&gt; ;
    ] ;
    owl:disjointWith bio:State ;
.
</preformat>
      <p>The code in Listing 2 introduces “MichaelField” as an
instance of the class bio:ArtisticGroup. This group
consists of two instances of the class Person, which are
described in Listings 4 and 5 below. It is important to
note that it is the specific group that is associated with
the pseudonym bio:Pseudonym_1 (“Michael Field”).
Neither of the authors alone should be associated with the
pseudonym, as was the case in certain data sets in the
LOD cloud.25</p>
      <p>Listing 2: An instance of bio:ArtisticGroup</p>
      <preformat>
bio:MichaelField
    rdf:type bio:ArtisticGroup ;
    bio:hasActivity bio:Writer ;
    bio:hasMember bio:Woman_1 ;
    bio:hasMember bio:Woman_3 ;
    bio:hasPseudonym bio:Pseudonym_1 ;
    rdfs:label "\"Michael Field\""@en ;
.
</preformat>
      <p>The code in Listing 3 introduces “Michael Field” as an
instance of the class bio:CollectivePseudonym.</p>
      <p>Listing 3: An instance of bio:CollectivePseudonym</p>
      <preformat>
bio:Pseudonym_1
    rdf:type bio:CollectivePseudonym ;
    bio:hasActivity bio:Writer ;
    bio:hasName "Michael Field" ;
.
</preformat>
      <p>The code in Listing 4 and Listing 5 below concerns the two
authors, who are not only involved in the artistic duo with the associated
pseudonym, but are also related to each other by both a familial
and an intimate relation.</p>
      <p>Listing 4: An instance of bio:Person</p>
      <preformat>
bio:Woman_1
    rdf:type bio:Woman ;
    bio:dateOfBirth "1846-10-27"^^xsd:date ;
    bio:dateOfDeath "1914-09-26"^^xsd:date ;
    bio:hasActivity bio:Writer ;
    bio:hasFirstName "Katharine Harris" ;
    bio:hasLastName "Bradley" ;
    bio:hasLover bio:Woman_3 ;
    bio:hasSister bio:Woman_2 ;
    bio:isMemberOf bio:MichaelField ;
.
</preformat>
      <p>25 It should also be mentioned that each member of this group also
had a pseudonym of her own, which we do not display here for reasons
of space.</p>
      <p>Listing 5: An instance of bio:Person</p>
      <preformat>
bio:Woman_3
    rdf:type bio:Woman ;
    bio:dateOfBirth "1862-01-12"^^xsd:date ;
    bio:dateOfDeath "1913-12-13"^^xsd:date ;
    bio:hasActivity bio:Writer ;
    bio:hasFather bio:Man_1 ;
    bio:hasFirstName "Edith Emma" ;
    bio:hasLastName "Cooper" ;
    bio:hasLover bio:Woman_1 ;
    bio:hasMother bio:Woman_2 ;
    bio:isMemberOf bio:MichaelField ;
.
</preformat>
      <p>With this draft encoding our aim was to show how to avoid
the issues described by Brown and Simpson (2013), who
stress the need to have both a generic ontological
framework for describing entities and a very specific
encoding scheme for accurately modelling all aspects and
subtleties of biographical data.</p>
    </sec>
    <sec id="sec-9">
      <title>7 Towards a Sub-cloud of the LOD Dedicated to Biographical Data</title>
      <p>
        Based on the observations we could make on the diverse
efforts to encode biographies in a Semantic Web compliant
format, which have been described in Section 2, Section 3
and Section 4, we see the need for reaching a wide
consensus on this ontological design, exploring and possibly
reusing existing biography vocabularies and ontologies.
In order to achieve this aim, we can build on the “Shared
Data Model” initiative
        <xref ref-type="bibr" rid="ref7">(Fokkens and ter Braake, 2018)</xref>
        ,26
which was put in place at the DH Biographical Data
Workshop held at the Digital Humanities 2016 conference.27 We
expect that generally accepted guidelines for the
ontological encoding of biographical data can be derived from this
moderated collection of data models.
      </p>
      <p>In addition, we advocate a collaborative effort
to establish a specialised sub-cloud of the LOD
framework dedicated to data sets containing biographical
data. In this way redundancies and inconsistencies in the
modelling of biographical data could be avoided, and the
modelling of such data could also gain a more salient
position and improved visibility in the LOD cloud.</p>
      <p>
        This community group could be organised in a similar
manner to the W3C Community Group for the representation
of language data in relation to ontologies and to the OKFN
Working Group on Linguistics28 for building a domain-specific
subset of the Linked Data cloud, in this case the LLOD
cloud.29
      </p>
      <p>
        26 This is a moderated collaborative effort for sharing data
models in the field of biography, resulting in a “Repository
for Biographical Data Models”
        <xref ref-type="bibr" rid="ref7">(Fokkens and ter Braake, 2018)</xref>
        ,
which can be accessed at https://github.com/cltl/BiographicalDataModels.
      </p>
      <p>27 http://www.biographynet.nl/dh-biographical-data-workshop/.</p>
    </sec>
    <sec id="sec-10">
      <title>Conclusions</title>
      <p>Based on our study of existing ontological models for
biographical data, we conclude that it seems
impossible to find a single property of a human being that
remains stable throughout their lifespan. This has consequences
for the modelling work, as we need to define precisely what
constitutes the uniqueness of an entry in a biographical data
set. We advocate a solution that consists in
introducing a URI for each entry, which must be equipped
with at least two fundamental properties describing the dates of
birth and of death. All values given to those (and other
related) properties are mutable and can also vary
depending on the provenance information, which also needs to be
encoded in the biographical data set.</p>
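      <p>As a minimal sketch of this modelling choice, and not the encoding used in any of the ontologies discussed, the Python snippet below treats the URI as the only fixed part of an entry and stores every property value as a claim attributed to a source. The class names Entry and Claim, the property names dateOfBirth and dateOfDeath, the example.org URIs and the dates are all hypothetical and chosen purely for illustration.</p>
      <preformat>
```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """One provenance-qualified value for a property (hypothetical model)."""
    prop: str    # illustrative property name, e.g. "dateOfBirth"
    value: str   # the value asserted by the source
    source: str  # URI of the source making the assertion


@dataclass
class Entry:
    """A biographical entry: only the URI is fixed, every value may vary."""
    uri: str
    claims: list = field(default_factory=list)

    def assert_value(self, prop, value, source):
        """Record that a given source asserts a value for a property."""
        self.claims.append(Claim(prop, value, source))

    def values_by_source(self, prop):
        """Return a dict mapping each source to its values for one property."""
        grouped = defaultdict(list)
        for claim in self.claims:
            if claim.prop == prop:
                grouped[claim.source].append(claim.value)
        return dict(grouped)


# Hypothetical entry whose two sources disagree on the date of birth.
entry = Entry("http://example.org/bio/entry/1")
entry.assert_value("dateOfBirth", "1850-01-01", "http://example.org/source/A")
entry.assert_value("dateOfBirth", "1850-01-02", "http://example.org/source/B")
entry.assert_value("dateOfDeath", "1900-01-01", "http://example.org/source/A")

births = entry.values_by_source("dateOfBirth")
```
      </preformat>
      <p>In this sketch, contrasting values coexist under the same URI instead of overwriting each other, so a consumer can decide which source to trust without the identity of the entry being affected.</p>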
      <p>Furthermore, we came across reports that detail errors in
the encoding of biographical data in the Linked Data cloud,
errors that were generated by the inappropriate use of
ontological properties and vocabularies. This situation calls for
more collaborative work on the
ontological modelling of biographical data, and possibly also
for a W3C Community Group dedicated to the creation of
a biography-specific sub-cloud in the LOD framework.</p>
    </sec>
    <sec id="sec-11">
      <title>Acknowledgement</title>
      <p>The DFKI contribution to this paper was partly
supported by the H2020 project QT21 with agreement
number 645452. We thank the anonymous reviewers of the first
version of this paper for their very helpful comments. Our
thanks go also to Eileen Schnur for proofreading and
improving our text.</p>
      <p>The paper is dedicated to the memory of Hans-Ulrich
Krieger who unfortunately passed away in June 2017. He
was the initiator of our efforts in this field and published the
first version of the DFKI biography ontology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Laura</given-names>
            <surname>Brazzo</surname>
          </string-name>
          and
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Mazzini</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>From the Holocaust Victims Names to the Description of the Persecution of the European Jews in Nazi Years: the Linked Data Approach and a New Domain Ontology</article-title>
          .
          <source>In Book of Abstracts of DH 2015</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>S.</given-names>
            <surname>Brown</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Simpson</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>The curious identity of Michael Field and its implications for humanities research with the Semantic Web</article-title>
          .
          <source>In 2013 IEEE International Conference on Big Data</source>
          , pages
          <fpage>77</fpage>
          -
          <lpage>85</lpage>
          , Oct.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Michael</given-names>
            <surname>Buckland</surname>
          </string-name>
          and Michele Renee Ramos.
          <year>2010</year>
          .
          <article-title>Events as a structuring device in biographical mark-up and metadata</article-title>
          .
          <source>Bulletin of the Association for Information Science and Technology</source>
          ,
          <volume>36</volume>
          (
          <issue>2</issue>
          ):
          <fpage>26</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          28 See https://www.w3.org/2016/05/ontolex/ working-groups/wg-linguistics/. 29 http://linguistic-lod.org/llod-cloud for
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Bronwyn</given-names>
            <surname>Davies</surname>
          </string-name>
          and
          <string-name>
            <given-names>Susanne</given-names>
            <surname>Gannon</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Doing collective biography: Investigating the production of subjectivity</article-title>
          . McGraw-Hill Education (UK).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Martin</given-names>
            <surname>Doerr</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>The CIDOC conceptual reference module: an ontological approach to semantic interoperability of metadata</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ):
          <fpage>75</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Antske</given-names>
            <surname>Fokkens</surname>
          </string-name>
          and Serge ter Braake.
          <year>2018</year>
          .
          <article-title>Connecting people across borders: a repository for biographical data models</article-title>
          .
          <source>In Proceedings of the 2nd conference on Biographies in a Digital World.</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Hernández</surname>
          </string-name>
          , Aidan Hogan, and
          <string-name>
            <given-names>Markus</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Reifying RDF: what works well with Wikidata?</article-title>
          <source>In Proceedings of the 11th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2015)</source>
          , volume
          <volume>1457</volume>
          of CEUR Workshop Proceedings. CEUR-WS.org.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          , Miika Alonen, Esko Ikkala, and
          <string-name>
            <given-names>Eetu</given-names>
            <surname>Mäkelä</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Life stories as event-based linked data: case semantic national biography</article-title>
          .
          <source>In Proceedings of the 2014 International Conference on Posters &amp; Demonstrations Track-Volume</source>
          <volume>1272</volume>
          , pages
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          . CEUR-WS.org.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          , Petri Leskinen, Erkki Heino, Jouni Tuominen, and
          <string-name>
            <given-names>Laura</given-names>
            <surname>Sirola</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Reassembling and enriching the life stories in printed biographical registers: Norssi high school alumni on the semantic web</article-title>
          .
          <source>In International Conference on Language, Data and Knowledge</source>
          , pages
          <fpage>113</fpage>
          -
          <lpage>119</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Hans-Ulrich</given-names>
            <surname>Krieger</surname>
          </string-name>
          and
          <string-name>
            <given-names>Thierry</given-names>
            <surname>Declerck</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>An OWL ontology for biographical knowledge. Representing time-dependent factual knowledge</article-title>
          .
          <source>In Proceedings of the First Conference on Biographical Data in a Digital World 2015</source>
          . CEUR-WS.org, 7. Online-Proceedings: http://ceur-ws.org/Vol-1399/.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Hans-Ulrich</given-names>
            <surname>Krieger</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A detailed comparison of seven approaches for the annotation of time-dependent factual knowledge in RDF and OWL</article-title>
          .
          <source>In Proceedings of the 10th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (held in conjunction with LREC 2014)</source>
          . European Language Resources Association.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Niels</given-names>
            <surname>Ockeloen</surname>
          </string-name>
          , Antske Fokkens, Serge Ter Braake, Piek Vossen, Victor De Boer, Guus Schreiber, and
          <string-name>
            <given-names>Susan</given-names>
            <surname>Legêne</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Biographynet: Managing provenance at multiple levels and from different perspectives</article-title>
          .
          <source>In Proceedings of the 3rd International Conference on Linked Science - Volume 1116, LISC'13</source>
          , pages
          <fpage>59</fpage>
          -
          <lpage>71</lpage>
          , Aachen, Germany. CEUR-WS.org.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Michele R.</given-names>
            <surname>Ramos</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Biography Light Ontology: An Open Vocabulary For Encoding Biographic Texts</article-title>
          .
          <source>Technical report</source>
          , Bringing Lives to Light: Biography in Context Project.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Fabian M.</given-names>
            <surname>Suchanek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Gjergji</given-names>
            <surname>Kasneci</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Gerhard</given-names>
            <surname>Weikum</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Yago: a core of semantic knowledge</article-title>
          .
          <source>In Proceedings of the 16th international conference on World Wide Web</source>
          , pages
          <fpage>697</fpage>
          -
          <lpage>706</lpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Jouni</given-names>
            <surname>Tuominen</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Bio CRM: A Data Model for Representing Biographical Information for Prosopography. Version 2016-08-19</article-title>
          .
          <source>Technical report</source>
          , Bringing Lives to Light: Biography in Context Project.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>