<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jouni Tuominen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eero Hyvönen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petri Leskinen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HELDIG - Helsinki Centre for Digital Humanities, University of Helsinki</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Semantic Computing Research Group (SeCo), Aalto University</institution>
          ,
          <country country="FI">Finland</country>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>66</lpage>
      <abstract>
        <p>Biographies make a promising application case of Linked Data: they can be used, e.g., as a basis for Digital Humanities research in prosopography and as a key data and linking resource in semantic Cultural Heritage (CH) portals. In both use cases, a semantic data model for harmonizing and interlinking heterogeneous data from different sources is needed. This paper presents such a data model, Bio CRM, with the following key ideas: 1) The model is a domain-specific extension of CIDOC CRM, making it applicable not only to biographical data but to other CH data, too. 2) The model makes a distinction between enduring unary roles of actors, their enduring binary relationships, and perduring events, where the participants can take different roles modeled as a role concept hierarchy. 3) The model can be used as a basis for semantic data validation and enrichment by reasoning. 4) The enriched data conforming to Bio CRM is targeted to be used by SPARQL queries in flexible ways using a hierarchy of roles in which participants can be involved in events.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Data</kwd>
        <kwd>Data models</kwd>
        <kwd>Biographical representation</kwd>
        <kwd>Event-based modeling</kwd>
        <kwd>Role-centric modeling</kwd>
        <kwd>Prosopography</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Event-based Approach for Biographies</title>
      <p>
        The underlying idea of this paper is to represent life
stories of people as Linked Data, extracted and aggregated
from heterogeneous distributed data sources, such as
dictionaries of national biographies, museum collections, library
databases, Wikipedia etc.
        <xref ref-type="bibr" rid="ref12 ref19">(Hyvönen et al., 2018)</xref>
        . Linked
biographical data facilitates studying enriched individual
life stories in biography research
        <xref ref-type="bibr" rid="ref24">(Roberts, 2002)</xref>
        as well as
in prosopography research on groups of people
        <xref ref-type="bibr" rid="ref14 ref31">(Verboven
et al., 2007; Keats-Rohan, 2007)</xref>
        . This paper addresses
the fundamental technical research question that has to be
solved in this kind of work: how to model life stories, so
that they can be enriched from heterogeneous data sources
and interlinked with each other in a semantically
interoperable way?
      </p>
      <p>
Our research hypothesis is that a good choice for data
modeling and harmonization is the event-based approach
where a person’s life is seen as a sequence of
spatiotemporal, interlinked events from birth to death—a person may
also be involved in prenatal and posthumous events. For
example, metadata about a painting in a gallery actually
means that there has been a painting event, and this could
be included in the timeline of the artist’s semantic
biography. Event-based modeling and ontologies have already
been found useful for harmonizing heterogeneous cultural
heritage data. The most notable and widely used
ontology for this purpose is the CIDOC Conceptual Reference Model
(CRM)
        <xref ref-type="bibr" rid="ref4">(Doerr, 2009)</xref>
        , but there are also other models
        <xref ref-type="bibr" rid="ref22 ref26 ref27 ref30 ref31 ref9">(Raimond and Abdallah, 2007; Scherp et al., 2009; Shaw, 2010;
van Hage et al., 2011)</xref>
        .
      </p>
      <p>
        A recurring problem in event-based modeling is, however,
that it is not necessarily clear what is an event, since many
relations and roles
        <xref ref-type="bibr" rid="ref16">(Kozaki et al., 2006)</xref>
        occur in time and
space, too. For example: Are family relationships events,
e.g., being the father of or being married to someone? Are
professions events, such as being a president of a country,
because holding an office occurs in time and space with an
agent involved?
      </p>
      <p>
For example, the concept of “bishop” would be useful in
representing and querying biographical data, but what does
being a bishop actually mean? Is there a class and
instances of bishops, is being a bishop a property of a
person or a role, or how does the concept relate to the event
of holding a bishop’s office? Obviously, being a bishop
can be represented in different ways, but then harmonizing
and querying of data about bishops becomes very difficult
since the user cannot be sure in what alternative ways
being a bishop is actually represented. On the one hand,
we clearly need foundational ontological structures
        <xref ref-type="bibr" rid="ref6">(Guarino and Welty, 2002)</xref>
        for representing pieces of
heterogeneous knowledge in a systematic and unique way, but on
the other hand, there is a need for simple conceptualizations
and property structures for querying the data and
representing the data for the human user.
      </p>
      <p>To address these problems, this paper introduces a data
model, Bio CRM, for harmonizing, enriching, and using
biographical linked data based on events. Bio CRM is an
extension of CIDOC CRM to the biographical domain. This
ISO standard was selected as the basis because it is the most
widely used ontology standard for event-based modeling
in museums and has been integrated with the Functional
Requirements for Bibliographic Records (FRBR) family of
modeling standards in libraries<sup>2</sup>. Data from museums and
libraries are essential in describing life stories.</p>
      <p>In the following, major use cases for Bio CRM in
biographical and prosopographical research are first listed. After
this, the design principles and the actual data model are
presented, with three online applications illustrating the use of
the system. Finally, related work is discussed with a
comparison of Bio CRM with other data models.</p>
      <p><sup>2</sup> https://www.ifla.org/about-the-frbr-review-group</p>
    </sec>
    <sec id="sec-2">
      <title>2. Prosopographical Method</title>
      <p>
        The aim of using the Bio CRM data model in our case
studies is to facilitate using the prosopographical research
method
        <xref ref-type="bibr" rid="ref31">(Verboven et al., 2007, p. 47)</xref>
        that consists of two
major steps. First, a target population of people is selected
that share desired characteristics for solving the research
question at hand. For example, our research question may
be related to social networks of men who were born in
Finland 1800–1900 and were artists. Second, the target group
is analyzed further, and compared with other groups, in
order to solve the research question.
      </p>
      <sec id="sec-2-1">
        <title>1. Determine target groups</title>
        <p>Target groups can be found by data filtering with a human in the loop. For example,
SPARQL SELECT can be used to create a tabular set of
selected instances. In our case, faceted search is a promising
option for filtering out target groups in a flexible and
dynamic way. An interesting possibility for further research
would be to try to do filtering automatically using
knowledge discovery.</p>
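        <p>As a minimal sketch of such a filtering query, the example target group mentioned earlier (male artists born in Finland in 1800–1900) could be selected roughly as follows. The sketch assumes that gender and occupation roles are linked to persons with bioc:inheres_in, that births are encoded as CIDOC CRM E67_Birth events, and that time-spans carry xsd:dateTime values in P82a_begin_of_the_begin. The ex: namespace and the names ex:Male, ex:Artist, and ex:Finland are hypothetical placeholders for the vocabulary of the dataset at hand.</p>
        <preformat>PREFIX bioc:  &lt;http://ldf.fi/schema/bioc/&gt;
PREFIX cidoc: &lt;http://www.cidoc-crm.org/cidoc-crm/&gt;
PREFIX rdfs:  &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
PREFIX xsd:   &lt;http://www.w3.org/2001/XMLSchema#&gt;
PREFIX ex:    &lt;http://example.org/&gt;   # hypothetical dataset namespace

SELECT DISTINCT ?person ?name
WHERE {
  ?person a bioc:Person ;
          cidoc:P131_is_identified_by ?appellation .
  ?appellation rdfs:label ?name .

  # Unary roles of the person (ex:Male and ex:Artist are assumed role classes).
  ?gender a ex:Male ;
          bioc:inheres_in ?person .
  # Any subclass of ex:Artist in the role hierarchy also matches.
  ?occupation a/rdfs:subClassOf* ex:Artist ;
              bioc:inheres_in ?person .

  # Birth event qualified with place and time (assumed CIDOC CRM encoding).
  ?birth a cidoc:E67_Birth ;
         cidoc:P98_brought_into_life ?person ;
         cidoc:P7_took_place_at ?place ;
         cidoc:P4_has_time-span ?span .
  ?place cidoc:P89_falls_within* ex:Finland .
  ?span cidoc:P82a_begin_of_the_begin ?begin .
  FILTER (?begin &gt;= "1800-01-01T00:00:00Z"^^xsd:dateTime)
  FILTER (?begin &lt; "1901-01-01T00:00:00Z"^^xsd:dateTime)
}
ORDER BY ?name</preformat>
        <p>The tabular result can then be inspected and refined by the researcher, in line with the human-in-the-loop filtering described here.</p>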
        <p>Once a target group has been determined, specific working
hypotheses and specific historical questions concerning the
group can be formulated and analyses performed.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2. Prosopographical analysis</title>
        <p>Linked data and SPARQL
querying provide many possibilities for analyzing target
group data. For example, it is possible to analyze the
structure and changing composition of the group over time and the
changing roles of individuals or subgroups. In this case, the
result of a SPARQL SELECT or CONSTRUCT query is analyzed
further by specific algorithms or visualization tools, such as
information graphics.</p>
        <p>
          Another option is to employ network analysis
methods and tools
          <xref ref-type="bibr" rid="ref21 ref5 ref7">(Easley and Kleinberg, 2010; Hanneman
and Riddle, 2005)</xref>
          and visualizations
          <xref ref-type="bibr" rid="ref1 ref15 ref17 ref8">(Dadzie and Rowe,
2011; Kehrer and Hauser, 2013)</xref>
          . In this case, for example,
a SPARQL CONSTRUCT can be used for creating an RDF
network based on the underlying data.
        </p>
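        <p>For instance, a co-participation network could be derived roughly as follows. This is only a sketch: the property ex:co_participant and the ex: namespace are hypothetical, and the query assumes that event participants are role instances linked to persons with bioc:inheres_in.</p>
        <preformat>PREFIX bioc:  &lt;http://ldf.fi/schema/bioc/&gt;
PREFIX cidoc: &lt;http://www.cidoc-crm.org/cidoc-crm/&gt;
PREFIX ex:    &lt;http://example.org/&gt;   # hypothetical namespace for the derived network

# Link two persons whenever they took part in the same event.
CONSTRUCT {
  ?a ex:co_participant ?b .
}
WHERE {
  ?event cidoc:P11_had_participant ?roleA , ?roleB .
  ?roleA bioc:inheres_in ?a .
  ?roleB bioc:inheres_in ?b .
  FILTER (?a != ?b)
}</preformat>
        <p>The resulting RDF graph can be exported to a network analysis or visualization tool of choice.</p>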
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Representing Biographies as Linked Data</title>
      <p>
        To aggregate, enrich, and link biographical data with
related datasets the data must be made semantically
interoperable, either by data alignments (using, e.g., Dublin Core
and the dumb-down principle) or by data transformations
into a harmonized form
        <xref ref-type="bibr" rid="ref10">(Hyvönen, 2012)</xref>
        . In our case, we
selected the data harmonization approach and the
event-centric CIDOC CRM ISO standard as the ontological basis,
since biographies are based on life events. CIDOC CRM
provides a common and extensible semantic framework for
representing cultural heritage information, operating as a
“semantic glue” for integration, mediation, and interchange
of heterogeneous datasets from, e.g., museums, libraries,
and archives. In our work, biographies are modeled as
collections of CIDOC CRM events, where each event is
characterized by 1) the actors involved, 2) the place, 3) the time, and 4)
the event type. Bio CRM extends CIDOC CRM by
introducing role-centric modeling. The reason for the extension
is that while CIDOC CRM does include a mechanism for
representing roles of participants in events, its encoding in
RDF is complex and still in an experimental phase (see
Section 5 for further discussion).
      </p>
      <p>Bio CRM provides the general data model for
biographical datasets. Individual datasets that concern different
cultures or time periods, or that were collected by different researchers,
may introduce extensions for defining additional event and
role types. The Linked Data approach enables connecting
the biographies to contextualizing information, such as the
space and time of biographical events, related people,
historical events, publications, and paintings.</p>
      <p>The core design principles of the data model are:</p>
      <list list-type="bullet">
        <list-item><p>The model is a domain-specific extension of CIDOC
CRM, making it applicable not only to biographical
data but to other Cultural Heritage (CH) data, too.</p></list-item>
        <list-item><p>The model makes a distinction between enduring
unary roles of actors, their enduring binary
relationships, and perduring events, where the participants can
take different roles modeled as a role concept
hierarchy.</p></list-item>
        <list-item><p>The model can be used as a basis for semantic data
validation and enrichment by reasoning.</p></list-item>
        <list-item><p>The enriched data conforming to Bio CRM is targeted
to be used by SPARQL queries in flexible ways
using a hierarchy of roles in which participants can be
involved in events.</p></list-item>
      </list>
      <p>Bio CRM makes a clear distinction between a person’s
attributes, relations between people, and events in which
people participate in different roles.</p>
      <p>Attributes are properties of a person that are assumed
to characterize her independently of time and space.
For example, place and time of birth can be modeled
as attributes.</p>
      <p>Relations are established between people and are
assumed to characterize them independently of time and
space. For example, father-of is such a relation.
Relations can, however, have time and space as
qualifiers, e.g., student-of. For example, Ferdinand Bol
(1616–1680) was a student of Rembrandt in 1630–1641,
starting his own studio in 1641, but can be
characterized as a student-of Rembrandt in general. His
years in Rembrandt’s studio as a student can be
represented as an event (see below), if needed.</p>
      <p>Events take place in time and space and involve
participants in different roles, expressing the ways in which
persons participate in events. For example, an officiant
may participate in a certain baptism event.</p>
      <p>The core classes and properties of Bio CRM are presented
in Figure 1. The namespace of the Bio CRM schema is
http://ldf.fi/schema/bioc/, here used with the
prefix bioc. The full specification of Bio CRM (class and
property listing) is available at the namespace URI.
Similarly, the prefix cidoc is used for CIDOC CRM’s
namespace http://www.cidoc-crm.org/cidoc-crm/.</p>
      <p>A central focus in representing biographical data is
representing people and their networks. A person is
represented as an instance of bioc:Person, a subclass of
cidoc:E21_Person. This instance-of relationship is
persistent and never changes during the life of the person.
In order to identify a person, the person is associated with
core data: appellations, i.e., names and identifiers in other
data repositories, birth time and place, and death time and
place, using CIDOC CRM. A person’s birth and death are
represented as Birth and Death events, which can be qualified
with time and place. A Birth event can also incorporate information
about the mother and father.</p>
      <p>In addition to the core data, a person can also have other
attributes, relationships, and participate in events. Having a
role, say Teacher, may be temporary, or it may be an inherent
characteristic of the person as a whole at all times, even if it is
also possible to specify exactly when the role was held
(e.g., a professorship). Being a Teacher by education is
different from saying that the person happened to participate
in a particular teaching event, e.g., gave a lecture, in the
role of a Teacher.</p>
      <p>Genders, nationalities, and occupations of people are
represented by relating a person to a unary role using the
property bioc:bearer_of. Figure 2 depicts an
example of John F. Kennedy in the role of President. The
role (a blank node, as there is no need to give an
identity to it in this case) is attached to the Person
using the bioc:has_occupation relation (a subproperty
of bioc:bearer_of). While this expresses the gender,
nationality, or occupation generally, it is also possible to
qualify the roles in time and space by attaching a
contextualizing event, e.g., the employment of a person. This
is useful, as people have different roles during their lives
that typically last for a shorter period of time and may have
other qualifiers, too. For example, John Kennedy had the
role of President in the US in 1961–1963.</p>
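      <p>The unary role pattern of Figure 2 can be sketched as a SPARQL Update request. The ex: namespace, the identifier ex:jfk, and the role class ex:President are hypothetical names used only for illustration; whether the inverse statement with bioc:inheres_in is stored explicitly or derived by reasoning is left open here.</p>
      <preformat>PREFIX bioc: &lt;http://ldf.fi/schema/bioc/&gt;
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
PREFIX ex:   &lt;http://example.org/&gt;   # hypothetical dataset namespace

INSERT DATA {
  # Dataset-specific occupation class placed under the Bio CRM role hierarchy.
  ex:President rdfs:subClassOf bioc:Occupation .

  # The role itself is a blank node: it needs no identity of its own.
  ex:jfk a bioc:Person ;
         bioc:has_occupation [ a ex:President ] .
}</preformat>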
      <p>The unary roles of Bio CRM are divided into the following
class hierarchy:</p>
      <preformat>bioc:Unary_Role
  bioc:Gender
  bioc:Nationality
  bioc:Occupation</preformat>
      <p>The role class hierarchy can be further extended in
individual datasets, e.g., by listing the prevalent occupations in a
certain cultural era.</p>
      <p>The same role-based pattern is used for representing
inherent relationships between people, such as family
relations (mother, cousin, aunt, etc.) and social relations
(studentOf, knows, etc.). Relationships are represented
by relating an actor (a person or group) to another actor
in a role by using one of the subproperties of the
property bioc:has_relation. The role is attached to the
other actor with the property bioc:inheres_in
(the inverse property of bioc:bearer_of). Figure 3 depicts
an example of John F. Kennedy having a spouse, Jacqueline
Kennedy Onassis. Similarly to unary roles, relationships
can be qualified with temporal and spatial information by
using an event to contextualize the role. A person may
at some point have been a Spouse, a Lawyer in a company,
and a President of a country, possibly several times on
different occasions. For example, John Kennedy was the Spouse
of Jacqueline Kennedy Onassis in 1953–1963.</p>
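      <p>The binary relationship pattern of Figure 3 might be written along the following lines. The subproperty name ex:has_family_relation, the role class ex:Spouse, and the person identifiers are assumptions made for this sketch; only bioc:has_relation, bioc:inheres_in, and the Bio CRM role classes are taken from the model as described here.</p>
      <preformat>PREFIX bioc: &lt;http://ldf.fi/schema/bioc/&gt;
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
PREFIX ex:   &lt;http://example.org/&gt;   # hypothetical dataset namespace

INSERT DATA {
  # Assumed dataset-specific extensions of the Bio CRM vocabulary.
  ex:has_family_relation rdfs:subPropertyOf bioc:has_relation .
  ex:Spouse rdfs:subClassOf bioc:Family_Relationship_Role .

  ex:jfk    a bioc:Person .
  ex:jackie a bioc:Person .

  # JFK is related to a Spouse role, and that role inheres in the other actor.
  ex:jfk ex:has_family_relation [
      a ex:Spouse ;
      bioc:inheres_in ex:jackie
  ] .
}</preformat>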
      <p>The binary roles of Bio CRM are divided according to the
following class hierarchy:</p>
      <preformat>bioc:Binary_Relationship_Role
  bioc:Person_Relationship_Role
    bioc:Family_Relationship_Role
    bioc:Social_Relationship_Role
  bioc:Intergroup_Relationship_Role
  bioc:Group_Relationship_Role</preformat>
      <p>As with unary roles, the binary role classification
can be extended in individual datasets.</p>
      <p>The individual events of biographies are represented
using subclasses of bioc:Event, which is a subclass of
cidoc:E5_Event and inherits its properties. From a
semantic viewpoint, events are described especially in terms
of:</p>
      <list list-type="bullet">
        <list-item><p>the time of the event (cidoc:P4_has_time-span),</p></list-item>
        <list-item><p>the place of the event (cidoc:P7_took_place_at),</p></list-item>
        <list-item><p>the actors that participated in it (cidoc:P11_had_participant),</p></list-item>
        <list-item><p>other resources involved (cidoc:P12_occurred_in_the_presence_of).</p></list-item>
      </list>
      <p>Time and place properties refer directly to time spans and
instances of places, respectively. The values for
participating actors and other resources are instances of role classes.
An actor role associates an actor with a role, making it
possible for a person to participate in events in different roles
that can also be qualified in terms of additional properties.
Similarly to actors, physical objects and immaterial things
can be involved in an event in specific roles.</p>
      <p>Events can be used for further qualifying a unary role (e.g., an
occupation) or a binary relationship; in such cases, an
instance of bioc:Event has to be created. As an example,
Figure 4 represents the presidency of John F. Kennedy
qualified with time and the country. Another example in
Figure 5 depicts the marriage of John F. Kennedy and
Jacqueline Kennedy Onassis qualified with time. The individual
datasets may introduce their own classifications of event
types and associated roles.</p>
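      <p>A qualifying event such as the presidency of Figure 4 could be encoded roughly as shown next. The event class ex:Presidency, the role class ex:President, and all ex: identifiers are hypothetical; the time-span is given only as a label, since the exact date encoding depends on the dataset.</p>
      <preformat>PREFIX bioc:  &lt;http://ldf.fi/schema/bioc/&gt;
PREFIX cidoc: &lt;http://www.cidoc-crm.org/cidoc-crm/&gt;
PREFIX rdfs:  &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
PREFIX ex:    &lt;http://example.org/&gt;   # hypothetical dataset namespace

INSERT DATA {
  # Assumed dataset-specific event and role classes.
  ex:Presidency rdfs:subClassOf bioc:Event .
  ex:President  rdfs:subClassOf bioc:Occupation .

  # The event carries time and place; its participant is a role instance.
  ex:jfk_presidency a ex:Presidency ;
      cidoc:P11_had_participant ex:jfk_president_role ;
      cidoc:P4_has_time-span ex:span_1961_1963 ;
      cidoc:P7_took_place_at ex:USA .

  # The role instance ties the event to the person.
  ex:jfk_president_role a ex:President ;
      bioc:inheres_in ex:jfk .

  ex:span_1961_1963 a cidoc:E52_Time-Span ;
      rdfs:label "1961–1963" .
}</preformat>
      <p>Because the role is now a named resource, the same instance can be reached both from the person and from the event.</p>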
      <p>By using roles, it is possible to keep the number of
properties smaller, because different properties for different roles
are not needed. Instead, different role classes are used.
Such a model is simpler to query using SPARQL and
provides the user with a useful and natural hierarchy of
role concepts.</p>
      <p>Possible roles that can be attached to certain
event types are specified using the OWL
restriction owl:allValuesFrom on the property
cidoc:P11_had_participant. This can be used for
validating data, i.e., to see if the events have participants
in incompatible roles. It is recommended that each event
class, say Baptism, has a corresponding class of allowed
roles, say Baptism_Actor_Role. Its subclasses are
roles whose instances can be used in filling the roles. In
this way, the data annotator can be guided to use only the
correct roles, and the new role class can be used for finding
resources in roles when querying. The role hierarchy
facilitates sharing roles between events and modifying
their role structure easily by just editing the role hierarchy.
This is more flexible than, e.g., changing property names,
if roles were represented using different properties.</p>
      <p>The following SPARQL query is an example of finding
all “bishops” in a dataset. Note that because of the chosen
role modeling approach, the query returns both the bishops
as unary roles (occupation) and acting bishops in specific
events (e.g., a confirmation). The namespace prefix
declarations are omitted from the query for brevity.</p>
      <preformat>SELECT ?person ?name ?event_title ?time ?place
WHERE {
  ?role a :Bishop ;
        bioc:inheres_in ?person .
  ?person a bioc:Person ;
          cidoc:P131_is_identified_by ?appellation .
  ?appellation rdfs:label ?name .

  OPTIONAL {
    ?event cidoc:P11_had_participant ?role ;
           rdfs:label ?event_title ;
           cidoc:P4_has_time-span ?time ;
           cidoc:P7_took_place_at ?place .
  }
}</preformat>
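      <p>The validation idea based on owl:allValuesFrom can also be approximated with a query instead of an OWL reasoner. The following sketch lists events that have a participant whose role is not an instance of the role class allowed for the event class. It is a closed-world check and assumes that role instances are explicitly typed and that the restrictions are attached to event classes with rdfs:subClassOf; a reasoner working under open-world semantics would behave differently.</p>
      <preformat>PREFIX cidoc: &lt;http://www.cidoc-crm.org/cidoc-crm/&gt;
PREFIX owl:   &lt;http://www.w3.org/2002/07/owl#&gt;
PREFIX rdfs:  &lt;http://www.w3.org/2000/01/rdf-schema#&gt;

SELECT ?event ?eventClass ?role ?allowed
WHERE {
  # Restriction declaring which role class an event class allows.
  ?eventClass rdfs:subClassOf ?restriction .
  ?restriction a owl:Restriction ;
               owl:onProperty cidoc:P11_had_participant ;
               owl:allValuesFrom ?allowed .

  # Events of that class (or of its subclasses) and their participant roles.
  ?event a/rdfs:subClassOf* ?eventClass ;
         cidoc:P11_had_participant ?role .

  # Report roles that are not typed with the allowed class or a subclass of it.
  FILTER NOT EXISTS { ?role a/rdfs:subClassOf* ?allowed . }
}</preformat>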
    </sec>
    <sec id="sec-4">
      <title>4. Bio CRM Case Studies</title>
      <p>In the following, three case studies for using Bio CRM are
presented.</p>
      <sec id="sec-4-1">
        <title>4.1. Early Modern Letters Online (EMLO)</title>
        <p>Bio CRM was originally developed as a spin-off case
study related to the database and web service Early
Modern Letters Online (EMLO)<sup>3</sup>. EMLO is a collaboratively
populated union catalogue of sixteenth-, seventeenth-,
and eighteenth-century letters, created by the Cultures of
Knowledge project<sup>4</sup> at the University of Oxford. It brings
manuscript, print, and electronic resources together in one
space, increasing access to and awareness of them, and
allows disparate and connected correspondences to be
cross-searched, combined, analyzed, and visualized.</p>
        <p>In addition to purely epistolary data, EMLO contains
prosopographical information related to the people in
the database, modeled as events and social relationships.
Events cover activities that the people have participated
in during their lives, such as birth and death, ecclesiastic
and educational activities, creations of works, travels, and
residences. The event metadata includes the event name,
type, participants and their roles, time span, location, and
source information. As a pilot Linked Data publication
of the EMLO database
          <xref ref-type="bibr" rid="ref29">(Tuominen et al., 2018)</xref>
          , we have
converted the prosopographical data into RDF format
using CIDOC CRM for the event-based modeling and W3C’s
PROV model
          <xref ref-type="bibr" rid="ref17">(Lebo et al., 2013)</xref>
          for representing the roles
of participants in the events.
        </p>
        <p>As a next step, we propose to convert the data into a Bio
CRM representation, and build the event and role
hierarchies pertaining to the activity types stored in the database.
The top levels of the event hierarchy of the EMLO database
are the following:</p>
        <preformat>bioc:Event
  :Ecclesiastical_Event
  :Educational_Event
  :Political_Event
  :Professional_Event
  :Social_Status_Change</preformat>
        <p>The class :Ecclesiastical_Event can be divided
further into subclasses with attached roles, such as:</p>
        <list list-type="bullet">
          <list-item><p>:Baptism – :Officiant, :Baptismal_Candidate, :Godfather, :Godmother, :Religion</p></list-item>
          <list-item><p>:Confirmation – :Officiant, :Confirmation_Candidate, :Religion</p></list-item>
        </list>
        <p><sup>3</sup> http://emlo.bodleian.ox.ac.uk</p>
        <p><sup>4</sup> http://www.culturesofknowledge.org</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. The register of the high school “Norssi” alumni</title>
        <p>
          Bio CRM has been applied in the study of transforming
printed biographical registers into Linked Data and
enriching their contents using Named Entity Linking. As a
concrete case study, we have concentrated on the printed
register “Norssit 1867–1992. Helsingin Norssin matrikkeli”,
a book of 708 pages, containing short bios of over 10 000
students and teachers of the prominent Finnish high school
“Norssi”, a training school of the University of Helsinki.
The final application in use<sup>5</sup> based on this case study is
described in more detail in
          <xref ref-type="bibr" rid="ref11 ref19">(Hyvönen et al., 2017)</xref>
          , and
the underlying data model is presented in
          <xref ref-type="bibr" rid="ref11 ref18 ref19">(Leskinen et al.,
2017)</xref>
          .
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Semantic National Biography of Finland</title>
        <p>
          Bio CRM has also been used in creating the first prototype
demonstrator of the Semantic National Biography of
Finland
          <xref ref-type="bibr" rid="ref12 ref19">(Hyvönen et al., 2018)</xref>
          , based on a collection of some
13 000 short biographies that were transformed into RDF
format and enriched using data linking to external datasets.
Application of Bio CRM to prosopographical research in
the Norssi and Semantic National Biography case studies
is described in more detail in
          <xref ref-type="bibr" rid="ref12 ref19">(Leskinen et al., 2018)</xref>
          .
        </p>
        <p><sup>5</sup> The application is available at http://www.norssit.fi/semweb/.
Its linked open data is published at the Linked
Data Finland service http://ldf.fi at the SPARQL endpoint
http://ldf.fi/norssit/sparql.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion, Related Work, and Future Research</title>
      <p>Bio CRM is, to the best of our knowledge, the first attempt
to extend CIDOC CRM into the domain of biography and
prosopography with additional subclasses, properties, and
modeling guidelines. A major benefit of the model is the
compatibility with cultural heritage data from museums,
libraries, and archives represented using the same standard
framework ontology.</p>
      <p>From a modeling perspective, this paper presented the idea
of making a distinction among attributes, relations, and
events, where entities participate in different roles in a
qualified manner, not as entities themselves. The underlying
rationale for this is to harmonize the knowledge
representation with fewer categories and at the same time keep the
model expressive and easy to use in SPARQL queries. The
model can be extended with transformation
rules by which more expressive and foundational event
structures are transformed into simpler attribute and role
structures, and vice versa, when this is appropriate from an
application perspective. Events are needed for
harmonizing data for the machine, but the human end user
often conceptualizes the world in document-centric or other
ways. Thus, additional representations of the same
event-based knowledge may be useful, especially if they are generated
automatically by the system.</p>
      <p>Our experiences of applying a first version of Bio CRM to
the three case studies suggest that the model is usable for
practical purposes. Though a formal evaluation of the model
has not been conducted, the application of Bio CRM to data
originating from different sources, in different formats, and
covering different eras is an indication of the suitability of
the model to act as a general harmonizing model for
prosopographical data. In the spirit of design science
methodology, Bio CRM was designed to solve the modeling needs
of the prosopographical data of the Early Modern Letters
Online (EMLO), which provides a rich classification of the
events and associated roles of people in the early modern
era. In the cases of the high school “Norssi” alumni and the
Semantic National Biography, Bio CRM is used to model,
e.g., the family relations and titles of the people (e.g.,
education, occupation). The idea of using semantically defined
linked data for modeling and aggregating biographical and
related data seems to be a promising approach for
biography and prosopography. However, more work is needed in
specifying more precisely the class and property
structures of the model in the general case. So far, new classes
and properties have been introduced based on the particular
use cases and the general modeling principles presented in
this paper.</p>
      <p>
        Biographical data has been studied by genealogists (e.g.,
(Event) GEDCOM<sup>6</sup>), CH organizations (e.g., the Getty
ULAN<sup>7</sup>), and semantic web researchers (e.g., the BIO
ontology<sup>8</sup>). Semantic web event models include, e.g., Event
Ontology
        <xref ref-type="bibr" rid="ref22 ref31 ref9">(Raimond and Abdallah, 2007)</xref>
        , LODE
ontology
        <xref ref-type="bibr" rid="ref27">(Shaw, 2010)</xref>
        , SEM
        <xref ref-type="bibr" rid="ref30">(van Hage et al., 2011)</xref>
        , and
EventModel-F<sup>9</sup>
        <xref ref-type="bibr" rid="ref26">(Scherp et al., 2009)</xref>
        . Also, Bibliographic
Ontology (BIBO)
        <xref ref-type="bibr" rid="ref2 ref26">(D’Arcus and Giasson, 2009)</xref>
        includes a
model for events. For a more detailed comparison on event
models, see
        <xref ref-type="bibr" rid="ref25">(Scherp and Mezaris, 2014)</xref>
        . A history
ontology with map visualizations is presented in
        <xref ref-type="bibr" rid="ref21">(Nagypal et al.,
2005)</xref>
        , and an ontology of historical events in
        <xref ref-type="bibr" rid="ref9">(Hyvönen
et al., 2007)</xref>
        . Visualization using historical timelines is
discussed, e.g., in
        <xref ref-type="bibr" rid="ref13">(Jensen, 2003)</xref>
        , and event extraction
reviewed in
        <xref ref-type="bibr" rid="ref8">(Hogenboom et al., 2011)</xref>
        .
      </p>
      <p>
        PROSO
        <xref ref-type="bibr" rid="ref32">(Zingoni, 2014)</xref>
        is a data model for presenting
prosopographical data records. It has a strong focus on
representing the provenance information of the records
using factoids, and uses event-based modeling for stating
the changes of a person (e.g., receiving an honorary
title). Vocabularies and ontologies for representing personal
relationships include the Standards for Networking
Ancient Prosopographies (SNAP) project<sup>10</sup> and the
Relationship vocabulary
        <xref ref-type="bibr" rid="ref3">(Davis, 2004)</xref>
        . Existing data models that
support role representation include CIDOC CRM, SEM,
VIVO/BFO, Schema.org, PROV, and the Organization
Ontology.
      </p>
      <p><sup>6</sup> http://en.wikipedia.org/wiki/GEDCOM</p>
      <p><sup>7</sup> http://www.getty.edu/research/tools/vocabularies/ulan/</p>
      <p><sup>8</sup> http://vocab.org/bio/</p>
      <p><sup>9</sup> https://www.kd.informatik.uni-kiel.de/en/research/ontologies/core-ontologies</p>
      <p>CIDOC CRM includes a mechanism for representing the
role of an active participant in an event, modeling it as a
property of the property that is used for representing the
participating actor (see CIDOC’s P14.1 in the role
of). There is a proposal for encoding CIDOC’s properties
of properties as RDF<sup>11</sup>, introducing a new class for the
property and auxiliary properties for connecting the event and
the participant, which adds complexity to the data model.
The status of the proposal is still experimental.</p>
      <p>Simple Event Model (SEM) is a general model for
expressing events, with support for three alternative representations
for roles, based on using a) rdf:value, b) reification, or
c) named graphs. All of the techniques add complexity to
the data model. The model also introduces a new property
sem:roleType, instead of using the standard RDF
property rdf:type.</p>
      <p>
        VIVO Integrated Semantic Framework ontology modules
(VIVO-ISF)<sup>12</sup> include a model for representing roles in
events. The model uses the properties of the Basic
Formal Ontology (BFO)
        <xref ref-type="bibr" rid="ref28">(Smith et al., 2015)</xref>
        and the
related Relations Ontology (RO)<sup>13</sup>: inheres_in to
attach the role to a person, and realized_in to
attach the role to an event (and their inverse properties
bearer_of and realizes). It should be noted that
the inclusion of the properties inheres_in/bearer_of
and realizes/is_realized_in in BFO is unclear in
the current version (BFO 2.0)<sup>14</sup>. Bio CRM has taken
some inspiration from VIVO, and uses similar properties
bioc:inheres_in and bioc:bearer_of, but retains
compatibility with CIDOC CRM by using the property
cidoc:P11_had_participant to attach the role to
the event.
      </p>
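      <p>The Bio CRM pattern described above can be sketched as plain subject-predicate-object triples. This is a minimal illustration, not the authors' implementation: the ex: identifiers and the role class ex:Teacher are hypothetical, the underscore spelling of the property names is an assumption, and Python tuples stand in for a real RDF library.</p>
      <preformat>
```python
# Triples as (subject, predicate, object) tuples.
# All ex: names are hypothetical; property spellings are assumed.
triples = {
    ("ex:event1", "rdf:type", "cidoc:E5_Event"),
    # The event points to a role instance, not directly to the person.
    ("ex:event1", "cidoc:P11_had_participant", "ex:role1"),
    # The kind of role is given with the standard rdf:type property.
    ("ex:role1", "rdf:type", "ex:Teacher"),
    # The role instance inheres in the person who bears it.
    ("ex:role1", "bioc:inheres_in", "ex:person1"),
}


def participants_with_roles(graph, event):
    """Return sorted (person, role class) pairs for one event."""
    found = []
    for s, p, role in graph:
        if s == event and p == "cidoc:P11_had_participant":
            persons = [o for (s2, p2, o) in graph
                       if s2 == role and p2 == "bioc:inheres_in"]
            classes = [o for (s2, p2, o) in graph
                       if s2 == role and p2 == "rdf:type"]
            found.extend((person, cls) for person in persons for cls in classes)
    return sorted(found)


print(participants_with_roles(triples, "ex:event1"))
```
      </preformat>
      <p>In this sketch the role instance is the only node that needs to be added between the event and the person, and the event side stays within CIDOC CRM.</p>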
      <p>Schema.org<sup>15</sup> provides a class schema:Role to
represent additional information about a relation or property.
It can be used to model the roles of participants in
events. An instance of schema:Role acts as
an intermediary node between the event and the participant,
and both are attached to it with the “original” property, e.g.,
cidoc:P11 had participant. The model’s strength
is its simplicity, but re-using a property in this way
might be unintuitive. The model also introduces a new
property schema:additionalType, instead of using
the standard RDF property rdf:type.</p>
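      <p>As a rough illustration of the intermediary-node pattern, the sketch below re-uses one property on both sides of a schema:Role node and resolves the participant by following that property twice. The ex: identifiers and the helper names objects and resolve are hypothetical, and tuples stand in for RDF triples.</p>
      <preformat>
```python
# The same property is used on both sides of the Role node (hypothetical data).
PROP = "cidoc:P11_had_participant"

triples = [
    ("ex:event1", PROP, "ex:role1"),
    ("ex:role1", "rdf:type", "schema:Role"),
    ("ex:role1", "schema:additionalType", "ex:Teacher"),
    ("ex:role1", PROP, "ex:person1"),
]


def objects(graph, subject, predicate):
    """Return all objects of the given subject and predicate."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


def resolve(graph, event, prop):
    """Follow prop from the event; pass through schema:Role nodes if present."""
    out = []
    for node in objects(graph, event, prop):
        if "schema:Role" in objects(graph, node, "rdf:type"):
            kinds = objects(graph, node, "schema:additionalType")
            for person in objects(graph, node, prop):
                out.append((person, kinds[0] if kinds else None))
        else:
            # No Role node in between: the value is the participant itself.
            out.append((node, None))
    return out


print(resolve(triples, "ex:event1", PROP))
```
      </preformat>
      <p>The extra branch in resolve shows the cost of the re-use: a consumer has to check whether each value is the participant or a Role node in between.</p>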
      <p>10 https://snapdrgn.net
11 http://www.cidoc-crm.org/roles-in-the-cidoc%E2%80%90crm-modelling-properties-of-properties
12 https://wiki.duraspace.org/display/VIVODOC19x/Ontology+Reference
13 https://github.com/oborel/obo-relations/
14 https://github.com/BFO-ontology/BFO/blob/BFO2.0/README.md
15 http://schema.org</p>
      <p>W3C’s PROV model uses qualified associations to
specify the roles of the participants in events. An
instance of the class prov:Association is attached
to the role using the property prov:hadRole, to the
agent using the property prov:agent, and to the event
using the property prov:qualifiedAssociation.
Thus, a standard CIDOC CRM event can be qualified
with such an association, but it might be unintuitive for
the user to represent the qualifier separately from the
cidoc:P11 had participant relation. By design,
PROV is meant for representing provenance information
about the activities involved in producing a piece of data or a thing; not all
biographical events are such activities.</p>
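      <p>The qualified-association pattern can be sketched in the same way. In the hypothetical data below the direct cidoc:P11 had participant relation and the prov:Association qualifier sit side by side, which is the separation discussed above. The ex: identifiers and the function name roles_in_event are assumptions made for illustration.</p>
      <preformat>
```python
# Hypothetical data: a direct participation plus a separate PROV qualifier.
triples = [
    ("ex:event1", "cidoc:P11_had_participant", "ex:person1"),
    ("ex:event1", "prov:qualifiedAssociation", "ex:assoc1"),
    ("ex:assoc1", "rdf:type", "prov:Association"),
    ("ex:assoc1", "prov:agent", "ex:person1"),
    ("ex:assoc1", "prov:hadRole", "ex:teacher"),
]


def roles_in_event(graph, event):
    """Map each agent of the event to the roles named by its associations."""
    roles = {}
    for s, p, assoc in graph:
        if s == event and p == "prov:qualifiedAssociation":
            agents = [o for (s2, p2, o) in graph
                      if s2 == assoc and p2 == "prov:agent"]
            held = [o for (s2, p2, o) in graph
                    if s2 == assoc and p2 == "prov:hadRole"]
            for agent in agents:
                roles.setdefault(agent, []).extend(held)
    return roles


print(roles_in_event(triples, "ex:event1"))
```
      </preformat>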
      <p>
        W3C’s Organization Ontology
        <xref ref-type="bibr" rid="ref23">(Reynolds, 2014)</xref>
        is a
core ontology for describing organizational structures.
It includes a model for specifying membership roles
people have in organizations by introducing the class
org:Membership. Such a membership is connected
to the role using the property org:role, to the agent
with the property org:member, and to the organization
with the property org:organization. The modeling
approach is similar to that of W3C’s PROV model, but uses
different property names.
      </p>
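      <p>A membership node of this kind might look as follows. The sketch uses hypothetical ex: identifiers and helper names (value, memberships) and flattens each org:Membership node into a (member, role, organization) row.</p>
      <preformat>
```python
# Hypothetical membership: a person holds a role in an organization.
triples = [
    ("ex:m1", "rdf:type", "org:Membership"),
    ("ex:m1", "org:member", "ex:person1"),
    ("ex:m1", "org:organization", "ex:university1"),
    ("ex:m1", "org:role", "ex:professor"),
]


def value(graph, subject, predicate):
    """Return the first object for subject and predicate, or None."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None


def memberships(graph):
    """Flatten every org:Membership node into (member, role, organization)."""
    rows = []
    for s, p, o in graph:
        if p == "rdf:type" and o == "org:Membership":
            rows.append((value(graph, s, "org:member"),
                         value(graph, s, "org:role"),
                         value(graph, s, "org:organization")))
    return rows


print(memberships(triples))
```
      </preformat>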
      <p>
        Bio CRM’s property bioc:inheres in is used for
representing both atemporal roles (unary roles and binary
relationships without qualifiers) and temporal roles (qualified by using
events) of people. This is a deliberate decision made for
the simplicity of the model. A different approach has been
chosen in Basic Formal Ontology (BFO) 2.0, where
relations can be represented as continuant or occurrent, with
separate relation types for them
        <xref ref-type="bibr" rid="ref28">(Smith et al., 2015)</xref>
        . BFO’s
approach has been criticized for its complexity, which causes
logical and usability issues
        <xref ref-type="bibr" rid="ref20">(Mungall, 2013)</xref>
        .
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>
        The development of Bio CRM was started in the EU COST
project “Reassembling the Republic of Letters”<sup>16</sup>
        <xref ref-type="bibr" rid="ref29">(Tuominen et al., 2018)</xref>
        .
      </p>
    </sec>
    <sec id="sec-7">
      <title>6. References</title>
      <p>16 http://www.republicofletters.net</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Aba-Sah</given-names>
            <surname>Dadzie</surname>
          </string-name>
          and
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Rowe</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Approaches to visualising Linked Data: A survey</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <fpage>89</fpage>
          -
          <lpage>124</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Bruce</given-names>
            <surname>D'Arcus</surname>
          </string-name>
          and Frédérick Giasson.
          <year>2009</year>
          .
          <article-title>Bibliographic Ontology specification</article-title>
          . http://bibliontology.com.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Ian</given-names>
            <surname>Davis</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Relationship: A vocabulary for describing relationships between people</article-title>
          . http://vocab.org/relationship/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Martin</given-names>
            <surname>Doerr</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Ontologies for cultural heritage</article-title>
          . In S. Staab and R. Studer, editors,
          <source>Handbook on ontologies (2nd Edition)</source>
          , pages
          <fpage>463</fpage>
          -
          <lpage>486</lpage>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>David</given-names>
            <surname>Easley</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jon</given-names>
            <surname>Kleinberg</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Networks, Crowds, and Markets: Reasoning about a Highly Connected World</article-title>
          . Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Guarino</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Welty</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Evaluating ontological decisions with OntoClean</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>45</volume>
          (
          <issue>2</issue>
          ):
          <fpage>61</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Robert A.</given-names>
            <surname>Hanneman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mark</given-names>
            <surname>Riddle</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Introduction to social network methods</article-title>
          . University of California, Riverside, CA.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Frederik</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          , Flavius Frasincar, Uzay Kaymak, and Franciska de Jong.
          <year>2011</year>
          .
          <article-title>An overview of event extraction from text</article-title>
          .
          <source>In DeRiVE 2011, Detection, Representation, and Exploitation of Events in the Semantic Web, at the 10th International Semantic Web Conference 2011 (ISWC 2011)</source>
          , Bonn, Germany, October.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          , Olli Alm, and
          <string-name>
            <given-names>Heini</given-names>
            <surname>Kuittinen</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Using an ontology of historical events in semantic portals for cultural heritage</article-title>
          .
          <source>In Proceedings of the Cultural Heritage on the Semantic Web Workshop at the 6th International Semantic Web Conference (ISWC</source>
          <year>2007</year>
          ), Busan, Korea, November.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Publishing and using cultural heritage linked data on the semantic web</article-title>
          . Morgan &amp; Claypool, Palo Alto, CA.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          , Petri Leskinen, Erkki Heino, Jouni Tuominen, and
          <string-name>
            <given-names>Laura</given-names>
            <surname>Sirola</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Reassembling and enriching the life stories in printed biographical registers: Norssi high school alumni on the semantic web</article-title>
          .
          <source>In Proceedings, Language, Technology and Knowledge (LDK</source>
          <year>2017</year>
          ), pages
          <fpage>113</fpage>
          -
          <lpage>119</lpage>
          , Galway, Ireland, June.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Eero</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          , Petri Leskinen, Minna Tamper, Jouni Tuominen, and
          <string-name>
            <given-names>Kirsi</given-names>
            <surname>Keravuori</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Semantic national biography of Finland</article-title>
          .
          <source>In Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN</source>
          <year>2018</year>
          ), Helsinki, Finland, March.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Matt</given-names>
            <surname>Jensen</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Visualizing complex semantic timelines</article-title>
          .
          <source>NewsBlip Research Papers, Report NBTR2003-001</source>
          . http://www.newsblip.com/tr/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Katharine S. B. Keats-Rohan</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Biography, identity and names: Understanding the pursuit of the individual in prosopography</article-title>
          .
          <source>In Prosopography Approaches and Applications. A Handbook</source>
          , pages
          <fpage>139</fpage>
          -
          <lpage>182</lpage>
          . University of Oxford.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Johannes</given-names>
            <surname>Kehrer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Helwig</given-names>
            <surname>Hauser</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Visualization and visual analysis of multifaceted scientific data: A survey</article-title>
          .
          <source>IEEE transactions on visualization and computer graphics</source>
          ,
          <volume>19</volume>
          (
          <issue>3</issue>
          ):
          <fpage>495</fpage>
          -
          <lpage>513</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Kouji</given-names>
            <surname>Kozaki</surname>
          </string-name>
          , Eiichi Sunagawa, Yoshinobu Kitamura, and
          <string-name>
            <given-names>Riichiro</given-names>
            <surname>Mizoguchi</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Fundamental consideration of role concepts for ontology evaluation</article-title>
          .
          <source>In Proceedings of EON2006</source>
          , Edinburgh, United Kingdom.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Timothy</given-names>
            <surname>Lebo</surname>
          </string-name>
          , Satya Sahoo, and
          <string-name>
            <surname>Deborah McGuinness</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>PROV-O: The PROV Ontology</article-title>
          .
          <source>W3C Recommendation 30 April 2013</source>
          , http://www.w3.org/TR/2013/REC-prov-o-20130430/.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Petri</given-names>
            <surname>Leskinen</surname>
          </string-name>
          , Jouni Tuominen, Erkki Heino, and Eero Hyvönen.
          <year>2017</year>
          .
          <article-title>An ontology and data infrastructure for publishing and using biographical linked data</article-title>
          .
          <source>In Proceedings of the Workshop on Humanities in the Semantic Web (WHiSe II)</source>
          , Vienna, Austria, October.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Petri</given-names>
            <surname>Leskinen</surname>
          </string-name>
          , Eero Hyvönen, and
          <string-name>
            <given-names>Jouni</given-names>
            <surname>Tuominen</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Analyzing and visualizing prosopographical linked data based on short biographies</article-title>
          .
          <source>In Proceedings of Biographical Data in a Digital World 2017 (BD2017)</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Christopher J.</given-names>
            <surname>Mungall</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>A critique of temporalized relations</article-title>
          .
          <source>Technical report. v1.3. April 25</source>
          , https://github.com/cmungall/trel-crit/raw/master/trc.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Gabor</given-names>
            <surname>Nagypal</surname>
          </string-name>
          , Richard Deswarte, and
          <string-name>
            <given-names>Jan</given-names>
            <surname>Oosthoek</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Applying the semantic web: The VICODI experience in creating visual contextualization for history</article-title>
          .
          <source>Lit Linguist Computing</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ):
          <fpage>327</fpage>
          -
          <lpage>349</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Yves</given-names>
            <surname>Raimond</surname>
          </string-name>
          and
          <string-name>
            <given-names>Samer</given-names>
            <surname>Abdallah</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>The Event Ontology</article-title>
          . http://motools.sourceforge.net/event/event.html.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Dave</given-names>
            <surname>Reynolds</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>The Organization Ontology</article-title>
          .
          <source>W3C Recommendation 16 January 2014</source>
          , https://www.w3.org/TR/2014/REC-vocab-org-20140116/.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Brian</given-names>
            <surname>Roberts</surname>
          </string-name>
          .
          <year>2002</year>
          . Biographical Research. Understanding social research. Open University Press.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Ansgar</given-names>
            <surname>Scherp</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vasileios</given-names>
            <surname>Mezaris</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Survey on modeling and indexing events in multimedia</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          ,
          <volume>70</volume>
          (
          <issue>1</issue>
          ):
          <fpage>7</fpage>
          -
          <lpage>23</lpage>
          , May.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Ansgar</given-names>
            <surname>Scherp</surname>
          </string-name>
          , Thomas Franz, Carsten Saathoff, and
          <string-name>
            <given-names>Steffen</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>F-a model of events based on the foundational ontology DOLCE+DnS Ultralight</article-title>
          .
          <source>In Proceedings of the Fifth International Conference on Knowledge Capture (K-CAP '09)</source>
          , pages
          <fpage>137</fpage>
          -
          <lpage>144</lpage>
          , Redondo Beach, California, USA.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>Ryan</given-names>
            <surname>Shaw</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>LODE: An ontology for linking open descriptions of events</article-title>
          . http://linkedevents.org/ontology/.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>Barry</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mauricio</given-names>
            <surname>Almeida</surname>
          </string-name>
          , Jonathan Bona, Mathias Brochhausen, Werner Ceusters, Melanie Courtot, Randall Dipert,
          <string-name>
            <surname>Albert Goldfain</surname>
          </string-name>
          , Pierre Grenon, Janna Hastings,
          <string-name>
            <given-names>William</given-names>
            <surname>Hogan</surname>
          </string-name>
          , Leonard Jacuzzo, Ingvar Johansson, Chris Mungall, Darren Natale, Fabian Neuhaus, James Overton, Anthony Petosa, Robert Rovetto, Alan Ruttenberg, Mark Ressler, Ron Rudniki, Selja Seppälä,
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Schulz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jie</given-names>
            <surname>Zheng</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Basic Formal Ontology 2.0 - specification and user's guide</article-title>
          .
          <source>Technical report. June 26</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>Jouni</given-names>
            <surname>Tuominen</surname>
          </string-name>
          , Eetu Mäkelä, Eero Hyvönen, Arno Bosse,
          <string-name>
            <given-names>Miranda</given-names>
            <surname>Lewis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Howard</given-names>
            <surname>Hotson</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Reassembling the republic of letters - a linked data approach</article-title>
          .
          <source>In Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN</source>
          <year>2018</year>
          ), Helsinki, Finland, March.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <surname>Willem Robert van Hage</surname>
          </string-name>
          , Véronique Malaisé,
          <string-name>
            <given-names>Roxane</given-names>
            <surname>Segers</surname>
          </string-name>
          , Laura Hollink, and
          <string-name>
            <given-names>Guus</given-names>
            <surname>Schreiber</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Design and use of the simple event model (SEM)</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          ,
          <volume>9</volume>
          (
          <issue>2</issue>
          ):
          <fpage>128</fpage>
          -
          <lpage>136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>Koenraad</given-names>
            <surname>Verboven</surname>
          </string-name>
          , Myriam Carlier, and
          <string-name>
            <given-names>Jan</given-names>
            <surname>Dumolyn</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>A short manual to the art of prosopography</article-title>
          .
          <source>In Prosopography Approaches and Applications. A Handbook</source>
          , pages
          <fpage>35</fpage>
          -
          <lpage>70</lpage>
          . University of Oxford.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <given-names>Jacopo</given-names>
            <surname>Zingoni</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>PROSO data model - a solution for modelling historical academic prosopographical records as linked data through an event based ontological approach</article-title>
          . In Atelier Heloïse Workshop, Lyon, France, March.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>