<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A practical tool for creating, managing and sharing evolving linked data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Graham Klyne</string-name>
          <email>graham.klyne@oerc.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cerys Willoughby</string-name>
          <email>Cerys.Willoughby@soton.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kevin Page</string-name>
          <email>kevin.page@oerc.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chemistry, University of Southampton</institution>
          ,
          <addr-line>Highfield, Southampton, SO17 1BJ</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Oxford e-Research Centre, University of Oxford</institution>
          ,
          <addr-line>7 Keble Rd, Oxford, OX1 3QG</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Annalist is a software system for individuals and small groups to reap the benefits of using RDF linked data, supporting them to easily create data that participates in a wider web of linked data. It presents a flexible web interface for creating, editing and browsing evolvable data, without requiring the user to be familiar with minutiae of the RDF model or syntax, or to perform any programming, HTML coding or prior configuration. Development of Annalist was motivated by data capture and sharing concerns in a small bioinformatics research group, and customized personal information management. Requirements centre particularly on achieving low activation energy for simple tasks, flexibility to add structural details as data is collected, access-controlled sharing, and ability to connect private data with public data on the web. It is designed as a web server application, presenting an interface for defining data structure and managing data. Data is stored as text files that are amenable to access by existing software, with the intent that a range of applications may be used in concert to gather, manage and publish data. During its development, Annalist has been used in a range of applications, which have informed decisions about its design and proven its flexibility and robustness in use. It has been particularly effective in exploring and rapidly prototyping designs for linked data on the web, covering science and humanities research, creative art and personal information.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Web</kwd>
        <kwd>Linked data</kwd>
        <kwd>Data management</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        In a blog post based on a 2013 ESWC keynote
presentation, Karger[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] argues that a primary feature of Semantic
Web applications should be to accommodate evolving data:
"A Semantic Web application is one whose schema is
expected to change". He also argues: "The current state of
tools for end users to capture, communicate, and manage
their information is terrible".
      </p>
      <p>
        Annalist ("keeper of records") is a linked data notebook, a
software system for creating, editing and managing RDF[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
linked data, which attempts to address some of the problems
noted by Karger, by allowing the structure of stored data to
evolve with understanding of requirements and the nature of
the available data. Its primary aim is to enable individual
users and small groups to reap the benefits of creating and
using linked data; i.e. to create data that can be shared,
evolved and re-mixed with other data on the web.
      </p>
      <p>Annalist is application-agnostic, but has been developed to
address data management for small research groups lacking
capacity for web site development. It aims to be:
easy-to-use, without programming; flexible, allowing structure to be
crystallised around available data; sharable, facilitating
collaboration with local and remote colleagues; and re-mixable,
for combining locally created data with community resources.
Annalist also "scratches an itch" as a tool for web-based
personal information management and sharing.</p>
      <p>While supporting contribution to linked data at web scale,
Annalist's design assumes that individual datasets fit
comfortably in the available RAM and local file system of a
modern personal computer. It is not a general-purpose RDF
editor, but approaches data from a perspective rooted in
application concepts rather than as an RDF graph, and not all
RDF structures can be generated or directly managed (this
does not preclude linking to arbitrary RDF data). Finally, it
is not a general publishing platform: the presentation of data
is oriented towards data management activities, and assumes
the user is familiar with the content.</p>
      <p>Annalist is an open development<sup>1</sup>, with source code, design
notes and documentation kept in a public GitHub repository<sup>2</sup>.
There is also a public demonstrator and tutorial<sup>3</sup>.</p>
    </sec>
    <sec id="sec-2">
      <title>2. MOTIVATION AND REQUIREMENTS</title>
      <p>
        One motivating use case for developing Annalist was
FlyTED, a database of Drosophila Testis Gene Expression
images[
        <xref ref-type="bibr" rid="ref39">39</xref>
        ][
        <xref ref-type="bibr" rid="ref41">41</xref>
        ]. Our experiences with FlyTED (summarized in
figure 1) give rise to the requirements identified in parentheses
and summarized below.
      </p>
      <p><sup>1</sup> http://oss-watch.ac.uk/resources/odm</p>
      <p><sup>2</sup> https://github.com/gklyne/annalist</p>
      <p><sup>3</sup> http://annalist.net/</p>
      <p>[Figure 1: FlyTED project, approximate timeline (2004 to 2010). Events shown: BioImage Database; evaluate OME; BioImage discontinued; ePrints adopted; FlyTED database public; NAR paper; live database lost in system failure.]</p>
      <p>
        The FlyTED database was originally intended to be
published using the BioImage Database[
        <xref ref-type="bibr" rid="ref31 ref7">31, 7</xref>
        ], which was an
early implementation of a database incorporating metadata
based on semantic web standards. Limitations of early RDF
and web software tool sets imposed design compromises that
eventually led to the BioImage Database software not
being sustained. Also, an early version of Open Microscopy
Environment (OME)[
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] was evaluated, but its metadata
schema was found to be insufficiently flexible to
accommodate the range of annotation information that was required
(R4, R5). Eventually, a modified version of ePrints
repository software[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] was used to publish the gene expression
images and associated annotations. Some time after the data
were published, a combination of a virtual storage service
failure and a backup system configuration error resulted in
the running system being lost. Although the original data
remained available, the live data was stored in a relational
database file; loss of the publication platform, and lack of
available resources to re-apply the ePrints customizations
and rebuild the database meant the database was not
reinstated (R8, R9, R10). The loss might have been mitigated
if live data had been more easily shared, including via version
management systems (R6).
      </p>
      <p>
        Although not large, the FlyTED data were expensive to
gather, hence valuable, with each combination of gene and
phenotype requiring literature and database searches,
statistical analysis, laboratory procedures for sample preparation,
microscopic image capture and annotation by a biological
expert. Initial input of image annotations used spreadsheets,
as the biologists were familiar with these (R1, R3). During
the course of the investigations documented by FlyTED,
the terms used to record developmental stages during which
genes were active were adjusted to provide better coverage
of the observations (R5). Programmatic access to the
underlying data was subsequently used to create an exemplar
application, OpenFlyData[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], which provided facilities for
search and co-display of gene information across a number
of Drosophila databases (R11).
      </p>
      <p>
        The FlyTED database was created by a small team of
software developers working closely with biologists, but the
original intent was to pave the way for tools that biologists
could use without developer support (R1, R2, R3).
Without the support of developers, the biologists would almost
certainly have not gone beyond creating spreadsheets
containing their observations, the contents of which cannot easily
be cross-referenced with external data sources without prior
knowledge of their structure (e.g. which column is used for
the FlyBase gene ID?). In creating the published database,
observed genes were cross-referenced with FlyBase[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], the
community database of information about Drosophila genes,
and annotated using terms compliant with the MISFISHIE
standard[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] (R4, R7).
      </p>
      <p>Finally, and not directly related to the FlyTED experience,
we also wanted Annalist to be suitable for offline use (e.g. for
field work) (R12).</p>
      <sec id="sec-2-1">
        <title>Summary of requirements</title>
        <p>R1 Ease of use: possible to quickly create a simple collection
and start capturing data.</p>
        <p>R2 Ease of use: no programming or HTML coding needed
to create a new collection.</p>
        <p>R3 Ease of use: detailed knowledge of RDF and/or OWL
not needed to create or edit data.</p>
        <p>R4 Flexibility: choice of RDF vocabulary used in the data.</p>
        <p>R5 Flexibility: possible to define or adapt structure of data
as it is collected.</p>
        <p>R6 Sharability: data can be shared between collaborators
using a variety of techniques, including online access
and offline file copying.</p>
        <p>R7 Remixability: use of domain vocabularies or ontologies
to facilitate combining with community datasets; ability
to present data as linkable web resources, and to link
to external web resources.</p>
        <p>R8 Portability: possible to move data between live systems;
not dependent on a single central service.</p>
        <p>R9 Sustainability of software: data capture, editing and
browsing possible using unmodified software.</p>
        <p>R10 Sustainability of data: underlying data stored and
exposed using a standard, easily used data format.</p>
        <p>R11 Visible data: underlying data exposed so that functions
not provided by the system (e.g. data visualization)
can be implemented by independent software.</p>
        <p>R12 Offline working: deployable on a personal computer,
allowing work on linked data collections without an
Internet connection.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. RELATED WORK</title>
      <p>As a system for creating and managing data on the web,
Annalist enters a crowded space with a wealth of alternatives
available. But, despite this, we are unaware of anything that
provides the out-of-box capability of Annalist for creating
linked data, and meeting the requirements outlined. Figure
2 gives an overview of systems with respect to the
requirements listed above. Our survey focuses on tools that directly
present end-user data management interfaces. There are
systems (e.g. Virtuoso, Sesame, Jena, WikiBase, etc.) that
are primarily semantic data stores and developer tools that
are not covered. Also, tools for data cleaning (e.g.
OpenRefine, LODRefine) or middleware that augments existing
data (e.g. Poolparty, Sponger) are considered complementary
to Annalist rather than alternatives.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Semantic Web Systems and Tools</title>
      <p>
        Callimachus<sup>4</sup> provides flexible support for sharable linked
data, but is a tool for developers rather than end-users.
Semantic MediaWiki<sup>5</sup>[
        <xref ref-type="bibr" rid="ref37">37</xref>
        ] is usable without additional
development, but is not really suited to desktop deployment, and
the data is not amenable to version management or sharing
via file sharing. Rauschmayer presented a poster[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] about
the Hyena RDF editor at SemWiki 2008, which appears to
envisage similar usage scenarios to Annalist, but does not
currently appear to be available as a usable tool. This work
is described in Rauschmayer's PhD thesis[
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], which states
that the core idea is to use a central repository for linked
information, whereas Annalist is conceived as being just a small
part in a wider linked data infrastructure. Wikidata<sup>6</sup>, built
upon the Wikibase<sup>7</sup> data store, acts as central storage for
structured data of Wikimedia projects. It has similarities
to Annalist ("items" and "statements" parallel Annalist's
Entities and Fields), but the user interface is not
customizable and it does not appear to support the creation of data
collections independently of Wikipedia and related projects.
      </p>
      <p><sup>4</sup> http://callimachusproject.org</p>
      <p><sup>5</sup> http://semantic-mediawiki.org</p>
      <p>[Figure 2: overview of systems, including Semantic MediaWiki, with respect to the requirements listed above.]</p>
      <p>
        There are ontology design tools, such as Protege[
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]
(including WebProtege), which can be used to create RDF data,
but a focus on ontology design leads to a complex interface
that is not well suited for end-user creation and management
of linked data. Changing the data structure requires an
understanding of ontology design.
      </p>
      <p>
        Piggy Bank[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] was developed as a tool for consuming web
data, and creating a local RDF store to facilitate navigation
and merging data from diverse sources. The emphasis here
was on consuming web data from heterogeneous sources
(something that Annalist can facilitate), but not so much on
creating linked data for sharing and eventual publication.
      </p>
      <p>
        Fresnel[
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] is an RDF vocabulary for controlling
presentation of RDF, for which Annalist uses home-spun terms. The
development of Annalist focused initially on creating a user
interface to create linked data without knowledge of HTML
or RDF (requirements R2, R3), and the vocabulary needed
to describe the presentation emerged from this approach.
With the technical requirements now established in running
code, evaluating a retro-fit of Fresnel could be a topic for
further work. RDForms<sup>8</sup> ("RDF Forms") is a JavaScript
library supporting a declarative description of views for
editing and presenting RDF, whose interface appears to have
some aspects in common with Annalist. But RDForms is a
developer tool, and not something that can be used without
programming.
      </p>
      <p><sup>6</sup> https://www.wikidata.org/</p>
      <p><sup>7</sup> http://wikiba.se</p>
      <p><sup>8</sup> http://rdforms.org/</p>
      <p>
        One use for Annalist has been to create additional
information, or annotations, related to web pages (e.g., see
section 5.5 below). Pundit<sup>9</sup> is positioned as a semantic web
annotation tool for research, capable of performing faceted
search over annotated web pages<sup>10</sup>. It appears to be able
to create linked data annotations, but it is not clear if it
can create free-standing linked datasets, or how easily the
annotations created can be exported and/or consumed by
other applications. Another annotation tool is Domeo<sup>11</sup>[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]:
this, too, can create RDF annotations of online documents,
but does not appear to create free-standing data.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3.2 Data sharing platforms</title>
      <p>
        Figshare<sup>12</sup> is a proprietary web platform for research
data sharing that is well-suited for sharing research papers,
supporting data and other materials, but does not of itself
provide support for creating re-mixable linked data.
ResearchSpace<sup>13</sup> is developing "a collaborative environment for
humanities and cultural heritage research using knowledge
representation and Semantic Web technologies", sharing some
goals with Annalist, but specialized for cultural heritage by
building specifically on CIDOC CRM[
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <p>
        The Database Wiki[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is another project to provide a
generic, collaborative, user-friendly interface over structured
data. In this case, the underlying data is XML rather than
RDF, and there is less emphasis on linking with external data.
This project is informed by Form Lenses[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], a principled
approach to mapping between stored data and a presented
user interface, based on Applicative Functors[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], which offer
a possible avenue for future work with Annalist.
      </p>
      <p>
        Histcross<sup>14</sup>[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] was a semantic database of historical data,
subsequently replaced by Segrada<sup>15</sup>. It apparently shares many
goals with Annalist, but does not work with linked data,
so would not readily participate in a wider network of data.
      </p>
    </sec>
    <sec id="sec-6">
      <title>3.3 Spreadsheets and desktop databases</title>
      <p>
        Regular spreadsheets (Excel, Open Office, etc.) are very
popular for research and personal information management,
offering flexibility, ease of use and sharing (e.g. via CSV), but
do not easily support combining data from different sources,
do not provide direct web access to the underlying data, and
are not particularly well suited for use with version control
systems. Rightfield[
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] is a tool that augments spreadsheets
to facilitate entry of semantically constrained terms, and as
such goes some way to addressing the remixing problems of
spreadsheet data, but does not really lend itself to creating
multiple cross-linked linked data structures, and shares other
limitations of spreadsheets.
      </p>
      <p>
        Bakke[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] reports an experiment with Related
Worksheets that explores the management of multiple relations
between worksheets in a desktop application. Their paper
clearly explains some problems that Annalist aims to address,
and proceeds to evaluate how they can be addressed in a
spreadsheet interface. The work suggests user interface
designs that might be helpful, but does not of itself provide
usable tools for creating linked data.
      </p>
      <p><sup>9</sup> http://thepund.it</p>
      <p><sup>10</sup> http://eswcdemo.gramsciproject.org</p>
      <p><sup>11</sup> http://swan.mindinformatics.org</p>
      <p><sup>12</sup> http://figshare.com</p>
      <p><sup>13</sup> http://www.researchspace.org</p>
      <p><sup>14</sup> https://github.com/mkalus/histcross</p>
      <p><sup>15</sup> http://www.segrada.org</p>
      <p>Desktop databases such as Access<sup>16</sup> require some
configuration effort before they can be used for capturing data,
which in turn are constrained by the relational schema used,
and not readily linked with external datasets.</p>
    </sec>
    <sec id="sec-7">
      <title>3.4 Content management systems</title>
      <p>Other classes of web application that might be considered
for research data management include Content Management
Systems (CMSs, such as Wordpress<sup>17</sup> or Drupal<sup>18</sup>). These
require significant development and/or configuration effort
to create a data sharing platform, and do not support the
full flexibility of RDF linked data. Drupal has built-in RDF
support that is layered over an underlying schema, and is
not amenable to change without re-working the underlying
site configuration. Also, CMSs tend to hide the underlying
data from direct view or manipulation, rather than exposing
it for other applications to use in different ways.</p>
    </sec>
    <sec id="sec-8">
      <title>3.5 Electronic Laboratory Notebooks</title>
      <p>
        Annalist in some respects resembles electronic laboratory
notebook systems (ELNs). There are many proprietary ELNs
that are aimed at commercial research laboratories and as
such may be beyond the budget of an individual or small
research group. There are also some open source ELNs
(e.g. Voegele et al[
        <xref ref-type="bibr" rid="ref36">36</xref>
        ], elabftw<sup>19</sup>, LabTrove[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]). We have not
specifically evaluated any of these, but in general they offer
a blog-like platform, where textual notes may be augmented
with named attribute or tabular data. We are not aware of
any ELNs that support web linked data.
      </p>
    </sec>
    <sec id="sec-9">
      <title>4. DESIGN</title>
      <p>
        Annalist adopts a frame-oriented, or entity-oriented,
approach to presenting and storing data, rather than being RDF
graph based. Frames are considered to be easy for people to
understand as they model some aspects of human memory
patterns[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. schraefel and Karger[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] explore in some depth
the presentation of semantic web data in user interactions,
and emphasize consideration of "what do we want to do"; in
the case of Annalist, what we want to do is create, manage
and share linked data on the web. We observe that when
researchers use spreadsheets to create data, it is commonly
arranged with a row of information for each of a number of
similar entities (e.g. microarray descriptions commonly use
a row of values for each of several thousand gene probes).
The frame-based approach has implementation advantages,
too: it provides a convenient grouping of data such that the
description of each entity is stored as a separate file, assigned
a URL, and directly accessed as a web resource.
      </p>
      <p>
        In discussions with researchers about their preference for
using spreadsheets, we were told that one of their reasons is
that spreadsheets do not impose a priori constraints on what
can be entered, making it easy for them to enter data as it
becomes available. Bakke et al [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] also note "When it comes
to general editing tasks on tabular data, spreadsheet systems
have an advantage even over most tailor-made applications",
and point to the advantages of a system that can "allow temporary
inconsistencies". Frey et al [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] also note that semantically aware
tools "could be too heavyweight and prescriptive", restricting
re-use in other areas. Annalist adopts a principle that its
first task is to make it easy for researchers to capture their
data; defining structure is secondary. Further, there is no
attempt to validate data entered, or impose any kind of
quality standards: we take a view that validity, quality and
refinement are dependent on a context of use[
        <xref ref-type="bibr" rid="ref40">40</xref>
        ], and as such
are best applied in that context.
      </p>
      <p><sup>16</sup> https://products.office.com/en-us/access</p>
      <p><sup>17</sup> https://wordpress.org</p>
      <p><sup>18</sup> https://www.drupal.org</p>
      <p><sup>19</sup> http://elabftw.net/</p>
      <p>Linked data vocabulary terms are commonly associated
with schemas (or ontologies), but we observe that such terms
may be adopted independently of any schema. Annalist
supports evolving data initially through addition of terms,
even to the extent of taking an unstructured narrative,
identifying significant elements, and progressively articulating
them using new terms, preferably drawn from existing stable
schemas (e.g. sections 5.1 and 5.2). A related concern is
evolving schemas, which incur changes to terms already used
in data. Annalist provides support for supertype URIs in
type definitions, and property aliases, which can assist with
type and property URI migration; generalization of these
features is ongoing. We have also performed migrations by
direct editing of the underlying data; while not an option for
non-technical users, it shows that schema evolution can also
be assisted by external services.</p>
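      <p>As an illustration of how property migration of this kind might be assisted by an external script, the following minimal sketch rewrites aliased property names in a record held as a JSON object. The alias table, the "ex:" property names and the function name are hypothetical, chosen for illustration only; they are not Annalist's actual alias mechanism.</p>
      <preformat><![CDATA[
```python
import json

# Hypothetical alias table: old property CURIE -> preferred property CURIE.
# These names are illustrative only, not Annalist's actual configuration.
PROPERTY_ALIASES = {
    "ex:devStage": "ex:developmentalStage",
    "ex:geneId": "ex:flybaseGeneId",
}


def migrate_properties(record, aliases):
    """Return a copy of `record` with aliased property names replaced.

    Values already stored under the preferred name take precedence, so
    running the migration twice gives the same result as running it once.
    """
    migrated = {}
    # First pass: copy every property that is not an alias source.
    for key, value in record.items():
        if key not in aliases:
            migrated[key] = value
    # Second pass: move aliased properties to their preferred names.
    for key, value in record.items():
        if key in aliases:
            migrated.setdefault(aliases[key], value)
    return migrated


# Illustrative record, not taken from any real dataset.
old_record = {
    "rdfs:label": "Expression image 42",
    "ex:devStage": "spermatocyte",
}
new_record = migrate_properties(old_record, PROPERTY_ALIASES)
print(json.dumps(new_record, indent=2, sort_keys=True))
```
]]></preformat>
      <p>Because the data are plain JSON files, a script of this kind can be run outside Annalist and the result placed under version control like any other change.</p>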
      <p>
        The requirements for internal data to be exposed to third
party applications, and data portability, are addressed by
using JSON (specifically JSON-LD[
        <xref ref-type="bibr" rid="ref33">33</xref>
        ]) as the primary
internal data storage format. JSON-LD conventions allow
data to be interpreted as RDF, while retaining the ease-of-use
of JSON and allowing use by applications that have no knowledge
of RDF. Within the data managed by Annalist, internal
links are stored as URI relative references[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], to be resolved
against the URL used to access the data, allowing data to
be copied from one Annalist deployment to another without
changes. Use of Compact URIs (CURIEs<sup>20</sup>) for field names
and types, with prefixes defined in external JSON-LD context
files, allows for more compact, readable and understandable
data compared with using full URIs.
      </p>
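      <p>The two conventions described above can be sketched in a few lines of Python. This is an illustrative sketch only: the prefix map stands in for an external JSON-LD context file, and the host names, port and collection path are assumptions rather than values taken from an Annalist deployment.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urljoin

# Hypothetical prefix map, as might be defined in a JSON-LD context file.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "prov": "http://www.w3.org/ns/prov#",
}


def expand_curie(name, prefixes):
    """Expand 'prefix:local' to a full URI; return other names unchanged."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name


def resolve_link(base_url, relative_ref):
    """Resolve a stored relative reference against the URL used for access."""
    return urljoin(base_url, relative_ref)


full_name = expand_curie("rdfs:label", PREFIXES)

# The same stored reference resolves correctly on two different deployments
# (both base URLs are assumed for illustration).
stored_ref = "../../Person/alice/"
local_link = resolve_link(
    "http://localhost:8000/annalist/c/demo/d/Artifact/guitar/", stored_ref
)
public_link = resolve_link(
    "http://example.org/annalist/c/demo/d/Artifact/guitar/", stored_ref
)
print(full_name)
print(local_link)
print(public_link)
```
]]></preformat>
      <p>Since only the relative reference is stored, copying the files to another host changes the resolved links without any change to the data.</p>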
      <p>[Figure 3: main components of the Annalist software. A browser and an identity provider (e.g. Google) interact with a Django-based web application (model, view, controller) comprising a form rendering engine, field renderers, data model access, web page templates and static web resources (CSS, JS, PNG, etc.), with storage in the file system holding control data (types, views, etc.) and user data. There is one web page template for all lists, one for all entity views, and one for all entity edit forms; page content is determined by control data from the file system.]</p>
      <p>Figure 3 shows the main components of the Annalist
software. It is implemented as an HTTP server application,
written in Python using the Django<sup>21</sup> software framework.
All the essential Annalist application logic is implemented
by the server, but there is some limited use of JavaScript
in the browser to provide a more responsive user interface;
the intent here is that Annalist can still be used in browser
environments where JavaScript is disabled or unavailable. The
Annalist server software is designed to be deployed locally
(on a personal computer), on a private network, or on a
publicly accessible host.</p>
      <p>At the heart of Annalist is a dynamic web-page creator
and form rendering engine that combines user data with a
form description to create an HTML web page. Figure 4 is
an example of data viewed using Annalist. The underlying
JSON-LD data can be accessed by web retrieval, either via
Annalist, or a suitably configured web server; the file layout is
designed to preserve relative references. This helps to ensure
that access to the data is not dependent on the health of the
Annalist service. A goal is to allow user data to be stored on
the web, separately from the Annalist service itself, though
there remain some access control details to be resolved to
make this a reality.</p>
      <p><sup>21</sup> https://www.djangoproject.com</p>
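      <p>The idea of combining user data with a form description can be sketched as follows. This is a deliberately simplified illustration using only the Python standard library; the structure of the form description, the field names and the sample record are assumptions, and the real rendering engine (built on Django templates and field renderers) is considerably richer.</p>
      <preformat><![CDATA[
```python
from html import escape

# Hypothetical form description: an ordered list of field descriptions.
view_description = {
    "label": "Artifact view",
    "fields": [
        {"property": "rdfs:label", "label": "Name"},
        {"property": "rdfs:comment", "label": "Description"},
    ],
}

# Illustrative user data; the angle brackets show that values are escaped.
record = {
    "rdfs:label": "Carolan Guitar",
    "rdfs:comment": "An acoustic guitar <with> a digital footprint",
}


def render_view(view, data):
    """Combine user data with a form description to produce HTML."""
    rows = []
    for field in view["fields"]:
        # Missing properties render as empty cells rather than errors.
        value = data.get(field["property"], "")
        rows.append(
            "<tr><th>{}</th><td>{}</td></tr>".format(
                escape(field["label"]), escape(str(value))
            )
        )
    return "<h1>{}</h1>\n<table>\n{}\n</table>".format(
        escape(view["label"]), "\n".join(rows)
    )


page = render_view(view_description, record)
print(page)
```
]]></preformat>
      <p>Changing the presentation then means editing the form description, not the code or the stored data.</p>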
      <p>Figure 5 shows an example of the Annalist data editing
interface. It actually shows a form used for editing the
definition of a form description, so is self-referential: the
labels of the fields on the left of the page are echoed in the
list of field descriptions at the bottom of the page.</p>
      <p>Figure 6 shows an excerpt from a listing of records. These
examples illustrate the main kinds of display provided by
Annalist: a detailed view of a single record, which may be
an editing view or a view-only display, and a summary list
of multiple records. Further examples of the Annalist user
interface can be found in the Annalist tutorial document<sup>22</sup>.</p>
      <p>[Figure 7: organization of Annalist data: Site; Collection; Record type; Data record; Record view; Record list view; Field description; Field group (optional).]</p>
      <p>Like the user data, form descriptions and other
configuration data are stored as JSON-LD files, and are editable
through the Annalist web interface. This makes Annalist
self-maintaining, in the sense that there is no separate
configuration interface or other mechanism needed to define data
types, storage structures, or their presentation.</p>
      <p>Data are organized as illustrated in figure 7. At the top
level is an Annalist Site, which is associated with an Internet
host (or localhost for desktop deployment). Site data is
grouped into free-standing Collections, which contain user
data, and metadata to define its structure and presentation.</p>
      <p>The data are stored in Data records, each of which is
presumed to describe some entity, and corresponds to an
addressable web resource or file (each having a distinct URL).
Record types correspond to the type of entity described,
and are used to combine similar entities for user
presentation (e.g. in lists), and also in the underlying data storage
(e.g. entities of different types are stored in different
directories or storage containers). Record views define forms used
for creating, editing or viewing data records; Record lists
define presentation of multiple entities for browsing. Field
definitions are referenced by record views and record lists,
and control the internal representation and presentation of
record component values. Field groups are used to group
fields for various purposes, e.g. to define repeated groups of
fields. URIs used for types and properties are contained in
the record type, field and view definitions, and may be drawn
from existing ontologies, or be local ad hoc identifiers with
potential to adopt existing vocabularies as correspondences are
determined.</p>
      <p><sup>22</sup> http://annalist.net/documents/tutorial/main</p>
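      <p>A storage arrangement of this kind, with one directory per record grouped by type within a collection, might be sketched as below. The "c/.../d/..." path pattern mirrors the demonstrator URLs cited in this paper, but the file name, the helper functions and the sample records are assumptions made for illustration, not a description of Annalist's actual file layout.</p>
      <preformat><![CDATA[
```python
import json
import tempfile
from pathlib import Path

# Assumed file name for a stored record; illustrative only.
RECORD_FILE = "entity-data.jsonld"


def record_dir(site_root, collection, record_type, record_id):
    """Directory for one data record: records of each type share a parent."""
    return Path(site_root) / "c" / collection / "d" / record_type / record_id


def save_record(site_root, collection, record_type, record_id, data):
    """Write one record as a JSON file in its own directory."""
    target = record_dir(site_root, collection, record_type, record_id)
    target.mkdir(parents=True, exist_ok=True)
    path = target / RECORD_FILE
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path


def list_records(site_root, collection, record_type):
    """List record ids of one type, as a record list view might."""
    type_dir = Path(site_root) / "c" / collection / "d" / record_type
    return sorted(p.name for p in type_dir.iterdir() if p.is_dir())


# A temporary directory stands in for the site's storage root.
with tempfile.TemporaryDirectory() as site_root:
    save_record(site_root, "demo", "Person", "alice", {"rdfs:label": "Alice"})
    save_record(site_root, "demo", "Person", "bob", {"rdfs:label": "Bob"})
    save_record(site_root, "demo", "Place", "oxford", {"rdfs:label": "Oxford"})
    people = list_records(site_root, "demo", "Person")
    print(people)
```
]]></preformat>
      <p>With such a layout, a collection can be copied, archived or placed under version control as an ordinary directory tree.</p>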
      <p>Access control is managed in two parts. Authentication
is by a third-party identity provider (IDP) using OpenID
Connect<sup>23</sup>, returning an authenticated email address;
Annalist has been tested to date using the Google login service<sup>24</sup>.
Authorization is handled by permissions stored as Annalist
records, which are defined and applied on a per-collection
basis, with fall-back to site-wide permissions. Permissions
required for access may depend on record type (e.g. ADMIN
permission required to access the permission records), and
in future this might be used for finer-grained control.</p>
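      <p>The per-collection lookup with site-wide fall-back can be sketched as follows. Only the ADMIN permission name comes from the description above; the other permission names, the email addresses, the "*" default entry and the table structures are hypothetical, and in Annalist the permissions are held as records, not as Python dictionaries.</p>
      <preformat><![CDATA[
```python
# Hypothetical permission records, keyed by authenticated email address.
SITE_PERMISSIONS = {
    "admin@example.org": {"VIEW", "CREATE", "UPDATE", "DELETE", "ADMIN"},
    "*": {"VIEW"},  # assumed default for any authenticated user
}
COLLECTION_PERMISSIONS = {
    "demo": {"alice@example.org": {"VIEW", "CREATE", "UPDATE"}},
}

# Assumed mapping from record type to the permission needed to view it.
REQUIRED_TO_VIEW = {"Permission": "ADMIN"}


def user_permissions(collection, email):
    """Collection-level permissions, falling back to site-wide ones."""
    collection_perms = COLLECTION_PERMISSIONS.get(collection, {})
    if email in collection_perms:
        return collection_perms[email]
    if email in SITE_PERMISSIONS:
        return SITE_PERMISSIONS[email]
    return SITE_PERMISSIONS.get("*", set())


def may_view(collection, email, record_type):
    """True if the user holds the permission required for this record type."""
    needed = REQUIRED_TO_VIEW.get(record_type, "VIEW")
    return needed in user_permissions(collection, email)


print(may_view("demo", "alice@example.org", "Artifact"))
print(may_view("demo", "alice@example.org", "Permission"))
print(may_view("demo", "admin@example.org", "Permission"))
```
]]></preformat>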
    </sec>
    <sec id="sec-10">
      <title>5. APPLICATIONS</title>
      <p>Annalist has been used with several personal and research
projects, described below, which have informed its ongoing
development. The first example includes a sketch of its
implementation to provide insight into use of Annalist, and
all can be examined at the URLs given.</p>
    </sec>
    <sec id="sec-11">
      <title>5.1 The Carolan Guitar</title>
      <p>
        This is a project of Nottingham University's Mixed Reality
Laboratory on "Augmenting a Guitar with its Digital
Footprint"[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], recording its history online in the form of a blog<sup>25</sup>.
Annalist has been used to create a linked data overlay of
this history that links to the blog itself, and also to other
key resources that are part of its history<sup>26</sup>. This overlay
models events (construction, composition, performance and
others), people, places, artifacts, materials, musical
compositions and more using vocabularies drawn from RDFS[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
CIDOC CRM[
        <xref ref-type="bibr" rid="ref11 ref22">22, 11</xref>
        ], FRBRoo[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and W3C PROV[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <sec id="sec-11-1">
        <title>5.1.1 Carolan Guitar implementation</title>
        <p>[Figure: diagram of the types used to describe the Carolan Guitar. Recovered labels: Artifact, Tool, Entity, Design, Material, Work, Construction, Composition, Performance, Event, Place, Person, Role.]</p>
        <p>23 http://openid.net/connect/</p>
        <p>24 https://developers.google.com/identity/protocols/OpenIDConnect</p>
        <p>25 http://carolanguitar.com</p>
        <p>26 http://demo.annalist.net/annalist/c/Carolan_Guitar/d/Artifact/Carolan_Guitar/</p>
        <p>[...] elements of both PROV (prov:Entity, prov:Activity) and
CIDOC CRM (crm:E71_Man-Made_Thing, crm:E5_Event)
ontologies. Other, more refined, types are introduced as judged
useful to capture the guitar's history. The Carolan
Guitar itself is an instance of Artifact, a subtype of Entity,
which is primarily an instance of the CIDOC CRM type
crm:E24_Physical_Man-Made_Thing, but is also associated
with a number of other types in the type definition<sup>27</sup>:
{"annal:id": "Artifact",
"annal:type": "annal:Type",
"rdfs:label": "A constructed physical entity",
"rdfs:comment": "An artifact, such as a musical instrument or some other object.",
"annal:uri": "crm:E24_Physical_Man-Made_Thing",
"annal:supertype_uris": [
{"annal:supertype_uri": "prov:Entity"},
{"annal:supertype_uri": "crm:E77_Persistent_Item"},
{"annal:supertype_uri": "crm:E70_Thing"},
{"annal:supertype_uri": "crm:E71_Man-Made_Thing"},
{"annal:supertype_uri": "crm:E18_Physical_Thing"},
{"annal:supertype_uri": "frbroo:F7_Object"}],
"annal:type_list": "_list/Artifacts",
"annal:type_view": "_view/Artifact"}</p>
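        <p>To show how software outside Annalist might read such a type definition, the Python sketch below gathers the primary type URI and its supertype URIs from a shortened copy of the record above. The helper name all_type_uris is hypothetical, and the sketch assumes only the key names visible in the example.</p>

```python
import json

# Shortened copy of the Artifact type definition shown above.
TYPE_DEF = json.loads("""
{"annal:id": "Artifact",
 "annal:uri": "crm:E24_Physical_Man-Made_Thing",
 "annal:supertype_uris": [
   {"annal:supertype_uri": "prov:Entity"},
   {"annal:supertype_uri": "crm:E71_Man-Made_Thing"}]}
""")

def all_type_uris(type_def):
    """Return the primary type URI followed by any declared supertype URIs."""
    uris = [type_def["annal:uri"]]
    for entry in type_def.get("annal:supertype_uris", []):
        uris.append(entry["annal:supertype_uri"])
    return uris

print(all_type_uris(TYPE_DEF))
```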
        <p>The central subject, the Carolan Guitar, is presented
using the Artifact Record view<sup>28</sup>. This view describes an
Artifact with an identifier, type, label, description, links to
further information, and (central to this application) a list
of life events, which correspond to a journal of its history.</p>
        <p>Information about the Carolan Guitar and its life events
is recorded in its description<sup>29</sup>, e.g.:
...
{"crm:P12i_was_present_at": "Construction/Construction_9"},
{"crm:P12i_was_present_at": "Performance/First_performance"},
{"crm:P12i_was_present_at": "Performance/Stairway"},
{"crm:P12i_was_present_at": "Performance/Hop_jam"},
{"crm:P12i_was_present_at": "Event/Photo_shoot"},
{"crm:P12i_was_present_at": "Composition/Catch_the_moment"},
...</p>
        <p>Here, relative URL references are used to designate life
events, each of which may record information about where it
took place, entities used, and who was involved in what roles.
Different types of event in the guitar's history (construction,
composition, performance, etc.) may also have different
information: a construction event view<sup>30</sup> may include information
about the tools and materials used; a performance view<sup>31</sup> may
include details of the works performed.</p>
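        <p>The relative references can be turned into absolute URLs by resolving them against a base. The Python sketch below uses the standard urljoin function; the base URL shown (an example.org address ending in the collection's d/ directory) is an assumption for illustration, and the helper name resolve_events is hypothetical.</p>

```python
from urllib.parse import urljoin

# Assumed base: the directory under which a collection's entities are addressed.
BASE = "http://example.org/annalist/c/Carolan_Guitar/d/"

events = [
    {"crm:P12i_was_present_at": "Construction/Construction_9"},
    {"crm:P12i_was_present_at": "Performance/Stairway"},
]

def resolve_events(base, event_refs, prop="crm:P12i_was_present_at"):
    """Resolve each relative event reference against base (RFC 3986 rules)."""
    return [urljoin(base, ref[prop]) for ref in event_refs]

for url in resolve_events(BASE, events):
    print(url)
```

        <p>Because the references carry no host name, the same data resolves correctly wherever the collection is deployed, provided the base is chosen consistently.</p>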
        <p>The modeling of the Carolan Guitar's story is by no means
complete (if such a thing is ever possible), and some choices
of what to include could reasonably be described as arbitrary.
But, using Annalist, the description can be augmented and
refined with additional types and more detailed view
descriptions as new requirements are encountered. Indeed, this has
already happened several times during its development.</p>
        <p>27 http://demo.annalist.net/annalist/c/Carolan_Guitar/d/_type/Artifact/type_meta.jsonld</p>
        <p>28 http://demo.annalist.net/annalist/c/Carolan_Guitar/d/_view/Artifact/view_meta.jsonld</p>
        <p>29 http://demo.annalist.net/annalist/c/Carolan_Guitar/d/Artifact/Carolan_Guitar/entity_data.jsonld</p>
        <p>30 http://demo.annalist.net/annalist/c/Carolan_Guitar/d/_view/Construction/view_meta.jsonld</p>
        <p>31 http://demo.annalist.net/annalist/c/Carolan_Guitar/d/_view/Performance/view_meta.jsonld</p>
      </sec>
    </sec>
    <sec id="sec-12">
      <title>Smoke: creating an audio-visual poem</title>
      <p>
        Procedural Blending is presented in Garrelfs' PhD thesis[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
as a model for discourse about creative processes. It has
similarities with the W3C PROV model[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], but also some key
differences. Annalist has been used to create a description
of the creation of "Smoke"<sup>32</sup>, an "experimental
documentary come audio-visual poem" about mid 20th-century air
pollution in the cities of Pittsburgh and St Louis. In this
case, a semi-structured blog-like journal was created by the
artist using Annalist<sup>33</sup>, together with a Procedural Blend
diagram using the model from Garrelfs' thesis. We worked
with the artist to encode the blend diagram as Annalist
records, and in the process were able to refine the model to
make it more consistently encodable while preserving the
original descriptive intent.
      </p>
    </sec>
    <sec id="sec-13">
      <title>Chemistry Personas</title>
      <p>Chemistry Personas<sup>34</sup> evaluates Annalist as a tool for
capturing records of academic researchers in chemistry, and for
identifying metadata from these records. It was used to
create a set of interfaces for capturing and linking
information about people, organisations, projects, and resources
associated with experiments (such as plans, materials,
equipment and activities), and the experiment records themselves, which
incorporate linked data. Designs of the models are based on
research information and associated metadata from
experiment records, and observed recording practices in chemistry
from a range of universities across the world.</p>
      <p>Annalist was found to be useful for capturing research
records and associated data, with the flexibility to be easily
adapted to the needs of different research groups and
individual researchers. We found the generic capability to create
specialized interfaces for capturing information allowed us
to handle the different requirements of different domains
and disciplines. The linked-data aspect is particularly useful
in enabling the simple reuse of resources and plans, and
inclusion of frequently used information into the research
records.</p>
    </sec>
    <sec id="sec-14">
      <title>Canal Cruising Log</title>
      <p>The canal cruising log<sup>35</sup> is an example of Annalist used for
personal information management. It captures information
about narrowboat cruising on the English canal network,
with information about daily movements, waterways, places
visited, other interesting locations, and maintenance
activities performed. It is based on a handwritten log book, and
attempts to capture information in searchable form that may
be useful when planning waterways travels. The
information modelling is ad hoc (i.e. uses private vocabulary terms).
Using Annalist, it would be quick and easy to revisit and
add class URIs from standard ontologies. Property URIs
are harder to update, but work is in progress to support
data migration as properties change.</p>
      <p>32 http://irisgarrelfs.com/smoke</p>
      <p>33 http://cream.annalist.net/annalist/c/IG_Philadelphia_Project/</p>
      <p>34 http://cream.annalist.net/annalist/c/Chemistry_Personas/</p>
      <p>35 http://demo.annalist.net/annalist/c/CruisingLog/</p>
    </sec>
    <sec id="sec-15">
      <title>Accommodation search</title>
      <p>The accommodation search collection<sup>36</sup> is another example
of personal information management, this time with a clear
real-world outcome. We sought a new home for an elderly
relative that would make it easier to provide increasingly
needed levels of support. The web made it easy to find
candidate properties, but there were specific requirements
(e.g., physical accessibility) not selectable by available search
facilities, so we had to filter from a large number of candidate
properties; further, good properties would come to market
and disappear quickly, so prompt information sharing was
needed. Annalist was used to rapidly create a specialised
database of candidates, with links to existing property web
sites, additional annotations, and an overall scoring of
suitability according to our particular requirements. This was
shared among family members, and when the ideal property
appeared we were able to consult over the details and arrange
an early visit.</p>
    </sec>
    <sec id="sec-16">
      <title>DISCUSSION</title>
    </sec>
    <sec id="sec-17">
      <title>Evaluation of requirements</title>
      <p>Section 2 set out a number of requirements arising mainly
from past experiences creating FlyTED. We now review those
requirements against the implemented Annalist applications
described in section 5.</p>
      <p>Requirements R2 (no programming), R6 (data sharing),
R9 (use of unmodified software) and R10 (expose data in
a standard format) are demonstrated by all of the
implementations described. Our work on these implementations
has repeatedly exploited R8 (portability of data) and R12
(offline working) by using a GitHub repository<sup>37</sup> to transfer
work-in-progress data between an offline laptop and online
servers; this use of GitHub for data exchange, backup and
versioning also demonstrates another aspect of R10
(sustainability of data). Annalist's exposure of underlying data
(R11) is present in all the applications described, but has
not been significantly exploited by independent software; this
will be tested in future work.</p>
      <p>The implementation of The Carolan Guitar data has shown
extensive use of R4 and R7 (choice and mixing of existing
RDF vocabularies), combining PROV, CIDOC CRM,
FRBRoo and some private terms. Requirement R5 (evolution
of structure) was used when working through the Carolan
Guitar's history, starting with its construction, but in later
stages dealing with musical performances and compositions.</p>
      <p>
        The Canal Cruising Log implementation in particular
demonstrates R3 (not needing knowledge of RDF) as this
has been developed without reference to any specific RDF
terms or characteristics. The frame-oriented approach to
data presentation appears to be more approachable than
requiring users to work with the RDF graph/triple structure
(also suggested by Kalus[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]). There is one aspect where
RDF influences remain visible: field descriptions must specify
a property string in the form of a URI or CURIE that is
used to identify an attribute in the data (the effect of which
is to constrain the syntax of attribute-identifying strings).
      </p>
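      <p>For readers unfamiliar with CURIEs, the Python sketch below shows how a property string of the form prefix:name can be expanded to a full URI using a prefix table, while full URIs and plain strings pass through unchanged. The function name expand_property and the prefix table are assumptions for illustration; the two namespace URIs are the standard RDFS and PROV namespaces.</p>

```python
# Illustrative prefix table; a real collection would define its own prefixes.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "prov": "http://www.w3.org/ns/prov#",
}

def expand_property(prop, prefixes=PREFIXES):
    """Expand a CURIE such as 'rdfs:label'; return anything else unchanged."""
    prefix, sep, local = prop.partition(":")
    if sep and prefix in prefixes and not local.startswith("//"):
        return prefixes[prefix] + local
    return prop

print(expand_property("rdfs:label"))
print(expand_property("http://example.org/vocab/visited"))
```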
      <p>The Accommodation search application demonstrates the
achievement of R1 (to quickly create a collection and start
capturing data), as this ephemeral application would never
have been realized if it had taken more than an hour
or so to create the collection with some initial data records.</p>
      <p>36 http://demo.annalist.net/annalist/c/Accommodation_search/</p>
      <p>37 https://github.com/ninebynine</p>
    </sec>
    <sec id="sec-18">
      <title>Further observations</title>
      <p>Primary storage of data as simple text resources:
there is no database or triple store behind Annalist. For
locally stored resources, each addressable resource is stored
as a single JSON-LD file, and the request URL is mapped to
a corresponding filename. The underlying resources may be
served directly by a web server without any Annalist
deployment being present, which we believe could be a boon for
long-term sustainability of research data (e.g., the Annalist
demo site<sup>38</sup> also offers links<sup>39</sup> that connect directly to the
underlying data). This approach allows collections to be
versioned using common version management tools (e.g. git),
and shared via version repositories.</p>
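      <p>The mapping from request URL to filename can be pictured with the Python sketch below. It is a minimal illustration under stated assumptions, not Annalist's actual mapping: the function name url_path_to_file, the site root directory and the example entity path are hypothetical, and the data file name entity_data.jsonld is taken from the example URLs in the footnotes.</p>

```python
import json
import tempfile
from pathlib import Path, PurePosixPath

def url_path_to_file(site_root, url_path, data_name="entity_data.jsonld"):
    """Map a request URL path to the JSON-LD file assumed to hold that resource."""
    # Drop the leading slash and any '.' or '..' segments so the result
    # always stays beneath site_root.
    parts = [p for p in PurePosixPath(url_path).parts if p not in ("/", "..", ".")]
    return Path(site_root).joinpath(*parts, data_name)

# Usage: store one record as a flat file, then read it back with no database.
with tempfile.TemporaryDirectory() as root:
    target = url_path_to_file(root, "/c/CruisingLog/d/Place/example_place/")
    target.parent.mkdir(parents=True)
    target.write_text(json.dumps({"rdfs:label": "Example place"}), encoding="utf-8")
    record = json.loads(target.read_text(encoding="utf-8"))
    print(target.relative_to(root).as_posix(), record["rdfs:label"])
```

      <p>Since every resource is an ordinary file in an ordinary directory tree, the same tree can be committed to a version repository or published by a static web server without change.</p>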
      <p>Edit conflicts: Annalist handles concurrent updates to
individual entities by atomic updates. There is an
unimplemented design for warning when an entity changes during
editing. Consistency of values between entities is not
currently enforced (see section 4).</p>
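      <p>The text does not say how atomic updates are achieved. One common technique for file-based storage, sketched below in Python as an assumption rather than a description of Annalist's code, is to write the new content to a temporary file in the same directory and then rename it over the original, so that readers see either the old version or the new one and never a partial file. The function name atomic_write_json is hypothetical.</p>

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    """Replace the file at path with data, written via a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same file system for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())      # push content to disk before renaming
        os.replace(tmp, path)         # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)            # leave no stray temporary file behind
        raise

with tempfile.TemporaryDirectory() as workdir:
    entity_file = os.path.join(workdir, "entity_data.jsonld")
    atomic_write_json(entity_file, {"rdfs:label": "first version"})
    atomic_write_json(entity_file, {"rdfs:label": "second version"})
    with open(entity_file, encoding="utf-8") as f:
        print(json.load(f)["rdfs:label"])
```

      <p>This protects a single entity from torn writes, but, as noted above, it does nothing to warn a user whose form was loaded before another user's save.</p>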
      <p>Performance: the Annalist demonstrator runs on a
modestly provisioned virtual machine (1 virtual CPU, 2 GB RAM).
We have not undertaken formal measurements, but have not
found performance to be a limitation in day-to-day use. This
matches our expectation that the underlying Linux file
system is very efficient for accessing the small files that comprise
Annalist collections. Some particular operations perform
repeated file accesses, and future work is planned to optimize
these cases. Planned developments will introduce an index
alongside the flat files, probably a triple store, to support
efficient search and query over the data.</p>
      <p>Data types for organisation, views to define
structure: traditional database systems use data types (or
equivalent) both to categorize data records and to associate
structure with the data (e.g. a relational table defines the
structure of each row in a table). This is less true for schema-free
databases (e.g. MongoDB<sup>40</sup>), but even here we may see that
structural features such as indexes are associated with what
is effectively a data type (e.g. in MongoDB, a "collection").
With Annalist, types are used simply to categorize data
records, and the structure of any record is determined by the
view (or views) that are used to create or edit it. This means
that different views can be used with a record type, according
to the context (e.g., considering a person as an employee
or as a customer). (When editing a record with fields not
referenced by the view used, those fields are unchanged when
the record is saved.)</p>
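      <p>The behaviour in the final parenthesis can be sketched in a few lines of Python. The function name save_through_view, the ex: property names and the sample values are hypothetical; the point is only that a save touches the fields the view references and carries the rest over unchanged.</p>

```python
def save_through_view(stored, view_fields, form_values):
    """Return a copy of stored in which only the view's fields are updated."""
    updated = dict(stored)                  # fields outside the view are carried over
    for prop in view_fields:
        if prop in form_values:
            updated[prop] = form_values[prop]
    return updated

# One record type (a person) edited through an assumed "employee" view.
person = {"rdfs:label": "Pat", "ex:employer": "Acme", "ex:customer_since": "2014"}
employee_view = ["rdfs:label", "ex:employer"]

edited = save_through_view(person, employee_view, {"ex:employer": "Initech"})
print(edited)
```

      <p>A "customer" view listing ex:customer_since could then be applied to the same record without either view needing to know about the other's fields.</p>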
      <p>Collection portability: URLs and URIs: moving a
collection between Annalist deployments means that the
URLs used to access records can change. But for some fields,
such as type descriptions, we need stable identifiers that
don't change with location of the data. The implementation
of Annalist recognizes this tension, and distinguishes between
URLs used to access and retrieve resources and URIs used
to identify them. This goes somewhat against the grain of
web wisdom, and in practice the distinction is used quite
sparingly, but we believe this illustrates that in pragmatic
applications, particularly where there may be copies of the
same information in different locations, the distinction may
have some value. (This differs from earlier discussions about
URNs and URIs<sup>41</sup>, in that the distinction is in no way
dependent on the URI scheme used: a given HTTP URI may
be a URI or URL depending on how it is used.)</p>
      <p>38 http://demo.annalist.net/</p>
      <p>39 http://annalist.net/annalist_sitedata/</p>
      <p>40 https://www.mongodb.com/</p>
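      <p>The distinction between identifying and locating can be made concrete with the small Python sketch below, in which two copies of a type description on different hosts share one identifying URI. The class, the field names and all addresses are hypothetical and serve only to illustrate the idea.</p>

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resource:
    uri: str   # stable identifier, unchanged when the collection moves
    url: str   # where this particular copy can currently be retrieved

def same_thing(a, b):
    """Two copies describe the same thing if their identifying URIs match."""
    return a.uri == b.uri

original = Resource(
    uri="http://example.org/vocab/Artifact",
    url="http://host-a.example.org/annalist/c/Guitar/d/_type/Artifact/",
)
copy = Resource(
    uri="http://example.org/vocab/Artifact",
    url="http://host-b.example.org/annalist/c/Guitar/d/_type/Artifact/",
)

print(same_thing(original, copy), original.url == copy.url)
```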
      <p>Usability: we have not yet undertaken formal usability
studies, but our experience to date indicates that it is possible
to quickly create data management forms that are usable
with no special knowledge or experience. We have found
that creating structure and view definitions can become
complicated when there are multiple relationships between
entity types, and improvements in this area are ongoing.</p>
      <p>Scale: Annalist deals with collections of modest size, but
through the milieu of the web even such modest collections,
in sufficient numbers, may contribute to data at much greater
scales. Tools, like Annalist, that facilitate creation of linked
data at local scales may be a key to enabling fully distributed
datasets at web scale.</p>
    </sec>
    <sec id="sec-19">
      <title>FURTHER WORK</title>
      <p>The nature of Annalist as a generic tool means there is
inevitably far more that could be done than has been achieved
to date. Work-in-progress enhancements include: modular
type/view definitions, importing definitions from a
predefined collection, which we see as allowing end-users to get
started even more quickly with generating their own linked
data (e.g. using "canned" definitions for bibliographic and
provenance information); usability improvements to
streamline common data entry tasks (e.g. automatic creation of
default views and lists associated with a data type); data
migration facilities to assist with applying vocabulary changes
to existing data.</p>
      <p>Work is currently underway to create an independent
frontend for presentation of musical performance data created
using Annalist (based on the structures developed for the
Carolan Guitar), which we plan to use to develop a
demonstration system aimed at enhancing audience experience of
live music concerts. This will provide an exemplar of how
Annalist may be used as part of a larger ensemble of tools
for creating and deploying applications using linked data.</p>
      <p>We have noted that Annalist is not a "big data" system, and
that design choices may constrain the effective size of a single
Annalist collection. But by creating multiple independent,
cross-linked and web-searchable data collections, we anticipate that
Annalist collections may be combined with other data sources,
contributing to creation of linked data resources at larger
scale. One way to explore this idea would be to use Annalist
to reinstate public access to the FlyTED data. Some initial
explorations are under way, and successful achievement of
this could provide a particularly compelling evaluation of the
Annalist principles and design.</p>
      <p>Looking ahead, we anticipate creation of "data bridges"
to allow existing data (especially in spreadsheets) to be
presented as linked data through Annalist; this might extend
further to real-time data acquisition, with data from sensors
like GPS or real-time feeds.</p>
      <p>
        The current implementation of Annalist uses the server
host file system for data storage, but it was an original goal
that Annalist could work with third-party storage. A
candidate for this would be a Linked Data Platform (LDP)[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]
server, though there are matters of access control to be
resolved. The Social Linked Data (Solid) project<sup>42</sup> uses
WebID<sup>43</sup> for access control, and could provide a promising
opportunity if it is possible to devise a mechanism to link
OpenID Connect authentication with WebID authorization.
      </p>
      <p>41 http://www.w3.org/TR/uri-clarification/</p>
      <p>
        Other areas of possible future work for Annalist include
provenance[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] recording and support for provenance
pingbacks[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] to recognize downstream use of Annalist data.
      </p>
      <p>
        Looking further ahead, we note that the Annalist dynamic
form generator has evolved in a somewhat ad hoc fashion, in
response to evolving recognition of requirements. As noted
in section 3.2, theoretical work on "form lenses"[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] might
be adopted to provide a more principled grounding for this
aspect of Annalist.
      </p>
      <p>Finally, we note that while Annalist has been an open
development from its outset, it has so far been conducted
largely by an "open community" of just one developer. For a
viable future we must engage a wider community of users
and developers to establish a more long-term sustainable
project. Contributions will be most welcome!</p>
    </sec>
    <sec id="sec-20">
      <title>CONCLUSIONS</title>
      <p>Annalist has been used in a number of research projects for
prototyping linked data information structures and, even as
a work-in-progress, has proven flexible and robust in use by a
small number of diverse users, with a low cost to get started
with a new collection. It has also proved effective in personal
information management projects involving annotation of
existing web resources and sharing structured data on the
Web. While many of the capabilities of Annalist are provided
by other systems, we are not aware of any other that combines
the key features of Annalist in a package that can be used
"out of the box" for data management.</p>
      <p>We have noticed that, while Annalist is easy to use for
basic data entry and browsing, developing more complex
structures requires greater effort, and does benefit from an
awareness of the RDF model (particularly with respect to
the use of URIs, or CURIEs, to identify things, classes of
things and relations between things). But this effort can be
applied incrementally, yielding rapid benefits and feedback,
supporting agility in creating information designs.</p>
      <p>
        Annalist's flexible approach to information structuring has
permitted an approach that differs from that often used when
creating databases (e.g. for research data), starting with a
very loosely structured narrative and progressively refining
structured information around that narrative. In this, we feel
that we have created a tool that goes some way to achieving
Karger's objectives for semantic web applications, viz. "to
work effectively over whatever schemas their users choose to
create or import"[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
    </sec>
    <sec id="sec-21">
      <title>ACKNOWLEDGEMENTS</title>
      <p>The development and evaluation of Annalist has been
supported in part by EPSRC EP/L019981/1 Fusing Semantic
and Audio Technologies for Intelligent Music Production and
Consumption and by the JISC-funded Research Data Spring
CREAM project<sup>44</sup>.</p>
      <p>42 https://github.com/solid/solid-spec</p>
      <p>43 http://www.w3.org/2005/Incubator/webid/spec/identity/</p>
      <p>44 https://blog.soton.ac.uk/cream/</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bakke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>A spreadsheet-based user interface for managing plural relationships in structured data</article-title>
          .
          <source>In Proc. SIGCHI conference on human factors in computing systems (CHI '11)</source>
          . ACM, Vancouver,
          <year>2011</year>
          , pp.
          <fpage>2541</fpage>
          –
          <lpage>2550</lpage>
          . doi: 10.1145/1978942.1979313.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bekiari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Doerr</surname>
          </string-name>
          , and P. Le Bœuf, eds.
          <source>FRBR - object-oriented definition and mapping to FRBRer (version 1.0)</source>
          . 1.0 ed., May
          <year>2009</year>
          . url: http://cidoc.ics.forth.gr/docs/frbr_oo/frbr_docs/FRBRoo_V1.0_draft_2009_may.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Benford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hazzard</surname>
          </string-name>
          , et al.
          <article-title>Augmenting a guitar with its digital footprint</article-title>
          .
          <source>In Proc. international conference on new interfaces for musical expression</source>
          . Louisiana State University,
          <year>2015</year>
          , pp.
          <fpage>303</fpage>
          –
          <lpage>306</lpage>
          . url: http://www.nime.org/proceedings/2015/nime2015_264.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fielding</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Masinter</surname>
          </string-name>
          .
          <article-title>RFC 3986, Uniform Resource Identifier (URI): Generic Syntax</article-title>
          .
          <year>2005</year>
          . url: https://tools.ietf.org/html/rfc3986.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Brickley</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Guha</surname>
          </string-name>
          .
          <source>RDF schema 1.1</source>
          . W3C Recommendation. World Wide Web Consortium, Feb.
          <year>2014</year>
          . url: http://www.w3.org/TR/rdf-schema/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Buneman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheney</surname>
          </string-name>
          , et al.
          <article-title>The Database Wiki project: a general-purpose platform for data curation and collaboration</article-title>
          .
          <source>SIGMOD record</source>
          ,
          <volume>40</volume>
          (
          <issue>3</issue>
          ):
          <fpage>15</fpage>
          –
          <lpage>20</lpage>
          ,
          <year>2011</year>
          . doi: 10.1145/2070736.2070740.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Catton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sparks</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Shotton</surname>
          </string-name>
          .
          <article-title>Chapter 21: publishing and finding images in the BioImage Database, an image database for biologists</article-title>
          .
          <source>In Cell biology (third edition)</source>
          , pp.
          <fpage>207</fpage>
          –
          <lpage>216</lpage>
          . Academic Press, Burlington, third edition,
          <year>2006</year>
          . doi: 10.1016/B978-012164730-8/50149-0.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ciccarese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ocana</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Clark</surname>
          </string-name>
          .
          <article-title>Open semantic annotation of scientific publications using DOMEO</article-title>
          .
          <source>J. biomedical semantics</source>
          , 3(S-1):
          <fpage>S1</fpage>
          ,
          <year>2012</year>
          . url: http://www.jbiomedsem.com/content/3/S1/S1.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wood</surname>
          </string-name>
          , and M. Lanthaler.
          <article-title>RDF 1.1 concepts and abstract syntax</article-title>
          .
          <source>W3C Recommendation</source>
          , Feb.
          <year>2014</year>
          . url: http://www.w3.org/TR/rdf11-concepts/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E. W.</given-names>
            <surname>Deutsch</surname>
          </string-name>
          et al.
          <article-title>Development of the minimum information specification for in situ hybridization and immunohistochemistry experiments (MISFISHIE)</article-title>
          .
          <source>Omics: a journal of integrative biology</source>
          ,
          <volume>10</volume>
          (
          <issue>2</issue>
          ):
          <fpage>205</fpage>
          –
          <lpage>208</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Doerr</surname>
          </string-name>
          .
          <article-title>The CIDOC conceptual reference module: an ontological approach to semantic interoperability of metadata</article-title>
          .
          <source>AI mag.</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ):
          <fpage>75</fpage>
          –
          <lpage>92</lpage>
          , Sept.
          <year>2003</year>
          . url: http://dl.acm.org/citation.cfm?id=958671.958678.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>G. dos Santos</surname>
          </string-name>
          et al.
          <article-title>Flybase: introduction of the Drosophila Melanogaster release 6 reference genome assembly and large-scale migration of genome annotations</article-title>
          .
          <source>Nucleic acids research</source>
          ,
          <volume>43</volume>
          (Database-Issue):
          <fpage>690</fpage>
          –
          <lpage>697</lpage>
          ,
          <year>2015</year>
          . doi: 10.1093/nar/gku1099.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Frey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Milsted</surname>
          </string-name>
          , et al.
          <article-title>Labtrove: a lightweight, web based, laboratory blog as a route towards a marked up record of work in a bioscience research laboratory</article-title>
          .
          <source>Plos one</source>
          ,
          <volume>8</volume>
          (
          <issue>7</issue>
          ):e67460–[18pp], July
          <year>2013</year>
          . doi: 10.1371/journal.pone.0067460. url: http://eprints.soton.ac.uk/355078/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>I.</given-names>
            <surname>Garrelfs</surname>
          </string-name>
          .
          <article-title>From inputs to outputs: an investigation of process in sound art practice</article-title>
          .
          <source>PhD thesis</source>
          . University of the Arts London, May
          <year>2015</year>
          . url: http://irisgarrelfs.com/thesis.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Groth</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Moreau</surname>
          </string-name>
          .
          <article-title>PROV overview: an overview of the PROV family of documents</article-title>
          . W3C Working Group Note. W3C, Apr.
          <year>2013</year>
          . url: http://www.w3.org/TR/prov-overview/.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutteridge</surname>
          </string-name>
          .
          <article-title>GNU EPrints 2 overview</article-title>
          . In
          <source>11th panhellenic academic libraries conference</source>
          ,
          <year>2002</year>
          . url: http://eprints.soton.ac.uk/256840/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Huynh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mazzocchi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          .
          <article-title>Piggy Bank: experience the semantic web inside your web browser</article-title>
          .
          <source>Web semantics: science, services and agents on the World Wide Web</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <year>2007</year>
          . doi: 10.1016/j.websem.2006.12.002.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kalus</surname>
          </string-name>
          .
          <article-title>Semantic networks and historical knowledge management: introducing new methods of computer-based research</article-title>
          .
          <source>Journal of the Association for History and Computing</source>
          ,
          <volume>10</volume>
          (
          <issue>3</issue>
          ), Dec.
          <year>2007</year>
          . url: http://hdl.handle.net/2027/spo.3310410.0010.301.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          .
          <article-title>Keynote at ESWC part 2: how the semantic web can help end users</article-title>
          .
          <source>Tech. rep. MIT CSAIL Research</source>
          ,
          <year>2013</year>
          . url: http://haystack.csail.mit.edu/blog/2013/06/06/keynote-at-eswc-part-2-how-the-semantic-web-can-help-end-users/.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          .
          <article-title>Keynote at the ESWC part 1: the state of end user information management</article-title>
          .
          <source>Tech. rep. MIT CSAIL Research</source>
          ,
          <year>2013</year>
          . url: http://haystack.csail.mit.edu/blog/2013/06/05/keynote-at-the-european-semantic-web-conference-part-1-the-state-of-end-user-information-management/.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Groth</surname>
          </string-name>
          .
          <article-title>PROV-AQ: provenance access and query</article-title>
          . W3C Working Group Note. W3C, Apr.
          <year>2013</year>
          . url: http://www.w3.org/TR/prov-aq/.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>P.</given-names>
            <surname>Le Bœuf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Doerr</surname>
          </string-name>
          , et al.
          <article-title>Definition of the CIDOC conceptual reference model, version V6.2</article-title>
          . Tech. rep.
          <source>International Council of Museums</source>
          , May
          <year>2015</year>
          . url: http://www.cidoc-crm.org/docs/cidoc_crm_version_6.2.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>T.</given-names>
            <surname>Lebo</surname>
          </string-name>
          et al.
          <article-title>PROV-O: the PROV ontology</article-title>
          .
          <source>W3C Recommendation. W3C</source>
          , Apr.
          <year>2013</year>
          . url: http://www.w3.org/TR/prov-o/.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>C.</given-names>
            <surname>McBride</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Paterson</surname>
          </string-name>
          .
          <article-title>Applicative programming with effects</article-title>
          .
          <source>J. funct. program.</source>
          ,
          <volume>18</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          –
          <lpage>13</lpage>
          , Jan.
          <year>2008</year>
          . doi: 10.1017/S0956796807006326.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A.</given-names>
            <surname>Miles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shotton</surname>
          </string-name>
          , et al.
          <article-title>OpenFlyData: an exemplar data web integrating gene expression data on the fruit fly Drosophila Melanogaster</article-title>
          .
          <source>Journal of biomedical informatics</source>
          ,
          <volume>43</volume>
          (
          <issue>5</issue>
          ):
          <fpage>752</fpage>
          –
          <lpage>761</lpage>
          ,
          <year>2010</year>
          . doi: http://dx.doi.org/10.1016/j.jbi.2010.04.00.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pietriga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          , et al.
          <article-title>Fresnel: a browser independent presentation vocabulary for RDF</article-title>
          .
          In
          <source>Proc. 5th international conference on the semantic web</source>
          . ISWC'06. Springer-Verlag, Athens,
          <year>2006</year>
          , pp.
          <fpage>158</fpage>
          –
          <lpage>171</lpage>
          . doi: 10.1007/11926078_12.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rajkumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lindley</surname>
          </string-name>
          , et al.
          <article-title>Lenses for web data</article-title>
          .
          <source>Electronic communications of the EASST</source>
          ,
          <volume>57</volume>
          (
          <issue>57</issue>
          ),
          <year>2013</year>
          . doi: 10.14279/tuj.eceasst.57.879.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rauschmayer</surname>
          </string-name>
          .
          <article-title>Connected information management</article-title>
          .
          <source>PhD thesis</source>
          . Ludwig-Maximilians-Universitat Munchen, Feb.
          <year>2010</year>
          . url: http://nbn-resolving.de/urn:nbn:de: bvb:
          <fpage>19</fpage>
          -
          <lpage>114390</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rauschmayer</surname>
          </string-name>
          .
          <article-title>Hyena RDF editor</article-title>
          . Poster. Displayed at
          <source>SemWiki 2008, 3rd Semantic Wiki Workshop: The Wiki Way of Semantics</source>
          .
          <year>2008</year>
          . url: http://ceur-ws.org/Vol-360/poster-11.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>m.</given-names>
            <surname>schraefel</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Karger</surname>
          </string-name>
          .
          <article-title>The pathetic fallacy of RDF</article-title>
          . In
          <source>International workshop on the semantic web and user interaction (SWUI)</source>
          ,
          <year>2006</year>
          . url: http://eprints.soton.ac.uk/262911/.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Shotton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sparks</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Catton</surname>
          </string-name>
          .
          <article-title>An introduction to the BioImage database, an image database for biologists</article-title>
          .
          <source>In Proc. 4th european light microscopy initiative meeting, Gothenburg</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>S.</given-names>
            <surname>Speicher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Arwe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Malhotra</surname>
          </string-name>
          .
          <article-title>Linked data platform 1.0</article-title>
          .
          <source>W3C Recommendation</source>
          . Feb.
          <year>2015</year>
          . url: http://www.w3.org/TR/ldp/.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sporny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Longley</surname>
          </string-name>
          , et al.
          <article-title>JSON-LD 1.0: a JSON-based serialization for linked data</article-title>
          .
          <source>W3C Recommendation</source>
          . Jan.
          <year>2014</year>
          . url: http://www.w3.org/TR/json-ld/.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          Stanford Center for Biomedical Informatics Research (BMIR).
          <article-title>Protégé ontology editor</article-title>
          . url: http://protege.stanford.edu.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Swedlow</surname>
          </string-name>
          et al.
          <article-title>The Open Microscopy Environment (OME) Data Model and XML file: open tools for informatics and quantitative analysis in biological imaging</article-title>
          .
          <source>Genome biology</source>
          ,
          <volume>6</volume>
          (
          <issue>5</issue>
          ):
          <fpage>R47</fpage>
          ,
          <year>2005</year>
          . doi: 10.1186/gb-2005-6-5-r47.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>C.</given-names>
            <surname>Voegele</surname>
          </string-name>
          et al.
          <article-title>A universal open-source electronic laboratory notebook</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>29</volume>
          (
          <issue>13</issue>
          ):
          <fpage>1710</fpage>
          –
          <lpage>1712</lpage>
          ,
          <year>2013</year>
          . doi: 10.1093/bioinformatics/btt253.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>M.</given-names>
            <surname>Völkel</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          .
          <article-title>Semantic wikipedia</article-title>
          . In
          <source>Proc. 15th international conference on World Wide Web</source>
          . ACM, Edinburgh, Scotland,
          <year>2006</year>
          , pp.
          <fpage>585</fpage>
          –
          <lpage>594</lpage>
          . doi: 10.1145/1135777.1135863.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>K.</given-names>
            <surname>Wolstencroft</surname>
          </string-name>
          et al.
          <article-title>RightField: semantic enrichment of systems biology data using spreadsheets</article-title>
          .
          In
          <source>8th IEEE international conference on e-science, e-science 2012</source>
          , Chicago, IL, USA, October 8-12,
          <year>2012</year>
          , pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          . doi: 10.1109/eScience.2012.6404412.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Benson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gudmannsdottir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>White-Cooper</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Shotton</surname>
          </string-name>
          .
          <article-title>FlyTED: the Drosophila testis gene expression database</article-title>
          .
          <source>Nucleic acids research</source>
          ,
          <volume>38</volume>
          (Suppl. 1):
          <fpage>D710</fpage>
          –
          <lpage>D715</lpage>
          ,
          <year>2010</year>
          . issn: 0305-1048. doi: 10.1093/nar/gkp1006.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gamble</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Goble</surname>
          </string-name>
          .
          <article-title>A checklist-based approach for quality assessment of scientific information</article-title>
          .
          <source>In 3rd international workshop on linked science (LISC2013)</source>
          ,
          <year>2013</year>
          . url: http://linkedscience.org/wp-content/uploads/2013/04/paper5.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Shotton</surname>
          </string-name>
          .
          <article-title>Building a semantic web image repository for biological research images</article-title>
          . In S. Bechhofer, M. Hauswirth, J. Hoffmann, and M. Koubarakis, editors,
          <source>The semantic web: research and applications</source>
          . Vol.
          <volume>5021</volume>
          , in Lecture Notes in Computer Science, pp.
          <fpage>154</fpage>
          –
          <lpage>169</lpage>
          . Springer Berlin Heidelberg,
          <year>2008</year>
          . doi: 10.1007/978-3-540-68234-9_14.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>