<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Annotopia: An Open Source Universal Annotation Server for Biomedical Research</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paolo Ciccarese</string-name>
          <email>paolo.ciccarese@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tim Clark</string-name>
          <email>tim_clark@harvard.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Massachusetts General Hospital and Harvard Medical School</institution>
          ,
          <addr-line>Boston MA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Annotopia is an open source, open services platform for creating, managing, manipulating and sharing open annotation using the W3C Open Annotation Data Model. It can create and/or manage annotation of HTML, PDF, and other resources including data and ontology concepts, with text, semantic tags, and other annotation types. It supports fine-grained permissions on annotations. Annotopia is a Swiss-army knife for W3C Open Annotation system developers, eliminating many otherwise challenging backend development tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>annotation</kwd>
        <kwd>biomedical</kwd>
        <kwd>entity recognition</kwd>
        <kwd>semantic web</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Background</title>
      <p>
        Annotation of documents and databases on the Web is a core aspect of the Web’s
interactivity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], but until recently, annotations have been second-class objects, tied
permanently to the applications and servers that host them [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This is a significant
missing feature for the scientific and biomedical community, which increasingly
relies on the Web as its primary means of knowledge dissemination and group
interaction.
      </p>
      <p>As a result of this feature gap, comments, discussions, semantic tags, references,
and other annotations on biomedical publications are scattered across disparate
servers and media-type-based representations. Our goal is to fill this gap, making
annotations first-class, independently managed objects on the Web.</p>
      <p>
        Projects from distributed hypermedia research programs in the 1980s, upon which
many aspects of the early Web were based, actually had several of these properties
[
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3–5</xref>
        ]. Berners-Lee’s inspired stripping down of these models into “the simplest thing
that could possibly work” [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] laid the basis for a transformative global, collaborative
development of the Web and its technologies [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], but necessarily removed such
features.
      </p>
      <p>
        In the early 2000s, W3C’s Annotea project began to restore some
of the lost features on top of the modern Web architecture [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Annotea was a
foundation for later annotation models and systems focused on biomedical annotation [
        <xref ref-type="bibr" rid="ref9 ref10">9, 10</xref>
        ]. Similar models were developed for digital humanities use cases. These
specifications were merged to form a broader, community-based specification, the
W3C Open Annotation Data Model [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], now on the standards track at the W3C.
      </p>
      <p>While various annotation tools are now available and in use, current annotation
platforms use different representation formats, and normally provide little or
no means to export annotations in a reusable form. The Open Annotation model is
directed at solving this format interoperability issue, but interoperability
on a large scale has not yet been demonstrated. Achieving it requires tooling
to handle storage and distributed integration of annotations in the Open Annotation
Data Model (OADM) format, and our analysis indicates that this is a server-side issue.</p>
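      <p>To make the shared format concrete, the following is a minimal sketch, not taken from Annotopia's code, of a single comment on a quoted text span expressed with Open Annotation terms as JSON-LD. The context URL and property names follow our reading of the 2013 Community Draft [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and should be checked against it; the identifiers, the document URL, and the function name make_comment_annotation are hypothetical.</p>
      <preformat preformat-type="code">
```python
import json

# JSON-LD context published alongside the 2013 Community Draft (assumed here).
OA_CONTEXT = "http://www.w3.org/ns/oa-context-20130208.json"


def make_comment_annotation(annotation_id, source_url, exact, prefix, suffix, comment):
    """Build a comment on a quoted text span as an Open Annotation JSON-LD dict."""
    return {
        "@context": OA_CONTEXT,
        "@id": annotation_id,
        "@type": "oa:Annotation",
        "motivatedBy": "oa:commenting",
        # Body: the comment itself, embedded as inline text.
        "hasBody": {
            "@type": ["cnt:ContentAsText", "dctypes:Text"],
            "format": "text/plain",
            "chars": comment,
        },
        # Target: a segment of the source document, located by quotation.
        "hasTarget": {
            "@type": "oa:SpecificResource",
            "hasSource": source_url,
            "hasSelector": {
                "@type": "oa:TextQuoteSelector",
                "exact": exact,
                "prefix": prefix,
                "suffix": suffix,
            },
        },
    }


annotation = make_comment_annotation(
    annotation_id="http://example.org/annotations/1",  # hypothetical identifier
    source_url="http://example.org/articles/42.html",  # hypothetical document
    exact="amyloid beta",
    prefix="accumulation of ",
    suffix=" in the cortex",
    comment="Check whether this refers to the 40 or 42 residue form.",
)
serialized = json.dumps(annotation, indent=2, sort_keys=True)
```
      </preformat>
      <p>Because the result is plain JSON-LD, any server that understands the model could in principle store or exchange it without knowing which client produced it.</p>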
      <p>Furthermore, updating existing software is never an easy task, and creating new
software with an Open Annotation backbone requires significant knowledge of the
OADM specifications and introduces software constraints. Most annotation efforts
appear to focus much of their energy on the front-end or client, because user interaction is
key for adoption and the technical difficulties of working with different operating
systems and browsers are not trivial.</p>
      <p>Moreover, the annotation projects of which we are aware all rely on different
back-end software. We argue that developing many different back-ends that
perform very similar operations results in higher community costs and slower
adoption of the Open Annotation specification. OADM services, if not based on a
common service model, will need to be implemented in several pieces of software before
different systems can communicate and exchange content.</p>
      <p>Lastly, we have noted that the existing annotation back-ends implement very
similar features that could be coded once and serve multiple clients. Within the
scope of an extensible architecture, services such as (i) Open Annotation compliant
storage, (ii) text mining, (iii) entity recognition, (iv) image analysis and (v) Linked Data
mashups could be implemented once as common services. This approach could
reduce the development time and cost of future annotation platforms, whose
developers would be able to focus on new features and components without
reinventing the common functionality.</p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p>In response to these challenges, our group developed, and will demonstrate in this
workshop, Annotopia: the first W3C OADM-compliant, biomedically-oriented Open
Annotation server. Annotopia is a joint project of researchers at the Massachusetts
General Hospital and Eli Lilly &amp; Company. It is an open-source product
(https://github.com/Annotopia). Annotopia operates as a Swiss Army knife for
annotation: it facilitates the creation of interoperable annotation platforms by providing an
extensible back-end solution supporting the open W3C standard. It thus allows
developers to focus their effort on client software, reducing development time and resources.</p>
      <p>Annotopia is constructed so that every Annotopia instance can support integration
with (i) multiple annotation clients; (ii) other Annotopia servers; (iii) other Open
Annotation compliant servers; (iv) servers that are not Open Annotation compliant; (v)
existing text mining services; (vi) pre-computed text mining results; (vii) ontology
management platforms and custom databases for generating structured annotation; and (viii)
Linked Data SPARQL endpoints, among others.</p>
      <p>
        Annotopia incorporates features and ideas from two other annotation servers we
developed recently: CATCH and Domeo. We have extensively described
Domeo elsewhere [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. CATCH supports HarvardX Massive Open Online Courses
(MOOCs), with textual and video annotation, for classes of more than 20,000
students.
      </p>
      <p>While CATCH and Domeo focus on annotation of video, images and textual
documents (HTML and PDF), Annotopia additionally allows annotation of data, or of
anything that is uniquely identifiable, including concepts in ontologies.</p>
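      <p>Because an Open Annotation target is identified by a URI, the target need not be a document. As a minimal sketch under that assumption (an illustration, not Annotopia's own code), the function below attaches a semantic tag to any URI-identified resource, such as a data record. The function name make_semantic_tag and all example.org URIs are hypothetical, and the property names reflect our reading of the 2013 Community Draft.</p>
      <preformat preformat-type="code">
```python
def make_semantic_tag(annotation_id, target_uri, concept_uri):
    """Tag any URI-identified resource with a concept, as an Open Annotation dict."""
    return {
        "@context": "http://www.w3.org/ns/oa-context-20130208.json",
        "@id": annotation_id,
        "@type": "oa:Annotation",
        "motivatedBy": "oa:tagging",
        # Body: the concept URI itself, typed as a semantic tag.
        "hasBody": {"@id": concept_uri, "@type": "oa:SemanticTag"},
        # Target: any resource with a URI, not only an HTML or PDF document.
        "hasTarget": target_uri,
    }


# Tagging a (hypothetical) dataset record with a (hypothetical) ontology concept.
data_tag = make_semantic_tag(
    annotation_id="http://example.org/annotations/2",
    target_uri="http://example.org/datasets/expression/row/17",
    concept_uri="http://example.org/ontology/Apoptosis",
)
```
      </preformat>
      <p>Swapping the target URI for that of an ontology concept would, under the same assumptions, yield an annotation on the concept itself.</p>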
      <p>
        Annotopia has been integrated and tested against the Annotator.js client, the
Domeo Web annotation toolkit, and the Utopia PDF viewer [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Annotopia architecture and technologies</title>
      <p>
        Annotopia has a modular architecture that provides a series of extension points.
Extension points are necessary for handling custom structured annotation types as
well as an ever-increasing number of external services to be integrated with the
platform through appropriate connectors. The core platform is written in Java/Groovy
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], making use of the Grails [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] web application framework. The Grails plugin
infrastructure has been used extensively to realize the modular approach.
      </p>
      <p>A high-level view of the architecture is shown in Figure 1.</p>
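      <p>The extension-point idea can be sketched independently of Grails. The example below is a hypothetical illustration in Python, not Annotopia's plugin API: ConnectorRegistry, toy_term_spotter, the result fields, and the concept URI are all invented for the purpose. It shows only the general pattern in which the core looks up external services by name through a registry, so that new connectors can be added without changing the core.</p>
      <preformat preformat-type="code">
```python
from typing import Callable, Dict, List

# A connector takes document text and returns candidate annotations (dicts).
Connector = Callable[[str], List[dict]]


class ConnectorRegistry:
    """Hypothetical extension point: external services register under a name."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Connector] = {}

    def register(self, name: str, connector: Connector) -> None:
        if name in self._connectors:
            raise ValueError("connector already registered: " + name)
        self._connectors[name] = connector

    def names(self) -> List[str]:
        return sorted(self._connectors)

    def run(self, name: str, text: str) -> List[dict]:
        # The core never imports a service directly; it looks it up by name.
        return self._connectors[name](text)


def toy_term_spotter(text: str) -> List[dict]:
    """Stand-in for an external text mining service: exact-match lookup."""
    terms = {"insulin": "http://example.org/concepts/insulin"}  # made-up URI
    lowered = text.lower()
    return [
        {"exact": term, "start": lowered.find(term), "concept": uri}
        for term, uri in terms.items()
        if term in lowered
    ]


registry = ConnectorRegistry()
registry.register("toy-term-spotter", toy_term_spotter)
results = registry.run("toy-term-spotter", "Insulin signalling was reduced.")
```
      </preformat>
      <p>In the actual platform this role is played by Grails plugins; the registry above is only a language-neutral analogy for how a connector can be added behind a stable interface.</p>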
    </sec>
    <sec id="sec-4">
      <title>Demonstration</title>
      <p>Our demonstration will showcase the annotation storage, search, and text mining
integration capabilities of Annotopia. We will also demonstrate interoperability
between multiple HTML and PDF article representations. We expect to be
able to demonstrate direct database annotation in the future as well.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>O'Reilly</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software</article-title>
          . In:
          <source>O'Reilly Network</source>
          ;
          <year>2005</year>
          [http://www.oreillynet.com/lpt/a/6228].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ciccarese</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soiland-Reyes</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>Web Annotation as a First-Class Object</article-title>
          .
          <source>IEEE Internet Computing</source>
          <year>2013</year>
          , Nov/Dec 2013:
          <fpage>71</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bechhofer</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goble</surname>
            <given-names>C</given-names>
          </string-name>
          :
          <article-title>COHSE: Conceptual Open Hypermedia Service</article-title>
          . In:
          <source>Annotation for the Semantic Web</source>
          . Edited by
          <string-name>
            <surname>Handschuh</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Staab</surname>
            <given-names>S</given-names>
          </string-name>
          . Amsterdam: IOS Press;
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Carr</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Roure</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hill</surname>
            <given-names>G</given-names>
          </string-name>
          :
          <article-title>The Distributed Link Service: A Tool for Publishers, Authors and Readers</article-title>
          .
          In:
          <source>Fourth International World Wide Web Conference</source>
          : December 11-14,
          <year>1995</year>
          ; Boston, Massachusetts, USA.
          <publisher-name>World Wide Web Consortium (W3C)</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>De Roure</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carr</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            <given-names>W</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hill</surname>
            <given-names>G</given-names>
          </string-name>
          :
          <article-title>Enhancing the Distributed Link Service for multimedia and collaboration</article-title>
          .
          In:
          <source>Proceedings of the Sixth IEEE Computer Society Workshop on Future Trends of Distributed Computing Systems</source>
          : 29-31 Oct
          <year>1997</year>
          .
          <fpage>330</fpage>
          -
          <lpage>335</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Venners</surname>
            <given-names>B</given-names>
          </string-name>
          :
          <article-title>The Simplest Thing that Could Possibly Work: a Conversation with Ward Cunningham</article-title>
          , Part V.
          <source>Artima Developer</source>
          <year>2004</year>
          [http://www.artima.com/intv/simplest.html].
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jacobs</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walsh</surname>
            <given-names>N</given-names>
          </string-name>
          :
          <article-title>Architecture of the World Wide Web, Volume One</article-title>
          . In: W3C Recommendation World Wide Web Consortium;
          <year>2004</year>
          [http://www.w3.org/TR/2004/REC-webarch-20041215/].
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kahan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koivunen</surname>
            <given-names>M-R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prud'Hommeaux</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swick</surname>
            <given-names>RR</given-names>
          </string-name>
          :
          <article-title>Annotea: An Open RDF Infrastructure for Shared Web Annotations</article-title>
          . In:
          <source>WWW10 International Conference</source>
          : May
          <year>2001</year>
          ; Hong Kong. World Wide Web Consortium [http://www10.org/cdrom/papers/488/index.html].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ciccarese</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ocana</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castro</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>An open annotation ontology for science on web 3.0</article-title>
          .
          <source>J Biomed Semantics</source>
          <year>2011</year>
          ,
          <volume>2</volume>
          (
          <issue>Suppl 2</issue>
          ):
          <fpage>S4</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ciccarese</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ocana</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>T</given-names>
          </string-name>
          :
          <article-title>Open Semantic Annotation of Scientific Publications with DOMEO</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          <year>2012</year>
          ,
          <volume>3</volume>
          (
          <issue>Suppl 1</issue>
          ):S1 [http://www.jbiomedsem.com/content/3/S1/S1].
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Sanderson</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciccarese</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sompel</surname>
            <given-names>HVd</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bradshaw</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brickley</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castro</surname>
            <given-names>LJG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cole</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Desenne</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gerber</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isaac</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jett</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Habing</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haslhofer</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leeds</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magliozzi</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morris</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morris</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ossenbruggen</surname>
            <given-names>Jv</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soiland-Reyes</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whaley</surname>
            <given-names>D</given-names>
          </string-name>
          :
          <article-title>W3C Open Annotation Data Model</article-title>
          , Community Draft, 08 February
          <year>2013</year>
          .
          <source>W3C</source>
          [http://www.openannotation.org/spec/core/].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Attwood</surname>
            <given-names>TK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kell</surname>
            <given-names>DB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDermott</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marsh</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pettifer</surname>
            <given-names>SR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thorne</surname>
            <given-names>D</given-names>
          </string-name>
          :
          <article-title>Utopia documents: linking scholarly literature with research data</article-title>
          .
          <source>Bioinformatics</source>
          <year>2010</year>
          ,
          <volume>26</volume>
          (
          <issue>18</issue>
          ):
          <fpage>i568</fpage>
          -
          <lpage>i574</lpage>
          [http://bioinformatics.oxfordjournals.org/content/26/18/i568.abstract].
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Henry</surname>
            <given-names>K</given-names>
          </string-name>
          :
          <article-title>A crash overview of groovy</article-title>
          .
          <source>Crossroads</source>
          <year>2006</year>
          ,
          <volume>12</volume>
          (
          <issue>3</issue>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rocher</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            <given-names>J</given-names>
          </string-name>
          :
          <source>The Definitive Guide to GRAILS</source>
          . Berkeley CA: Apress;
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>