<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bureau for Rapid Annotation Tool: Collaboration can do More over Variety-oriented Annotations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zheng Wang</string-name>
          <email>wangz@istic.ac.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shuo Xu</string-name>
          <email>xushuo@bjut.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Beijing University of Technology</institution>
          ,
          <addr-line>Chaoyang District, Beijing</addr-line>
          ,
          <country country="CN">P. R. China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Scientific and Technical Information of China</institution>
          ,
          <addr-line>Haidian District, Beijing</addr-line>
          ,
          <country country="CN">P. R. China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <fpage>80</fpage>
      <lpage>82</lpage>
      <abstract>
        <p>A high-quality manually annotated corpus is crucial for many text mining and information extraction tasks. Several workbenches have been developed in the literature to facilitate collaborative annotation. However, given the growing volumes of unannotated documents, these variety-oriented annotation workbenches have many shortcomings in terms of teamwork, quality control and time cost. To address these shortcomings, we develop a novel workbench so that collaboration can do more over variety-oriented annotation. Our workbench is named Bureau for Rapid Annotation Tool (Brat for short). Its main functionalities include an enhanced semantic constraint system, Vim-like shortcut keys, an annotation filter and a graph-visualizing annotation browser. To date, over 500,000 mentions have been annotated with our Brat workbench.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        A high-quality manually annotated corpus is crucial for many
text mining and information extraction tasks [
        <xref ref-type="bibr" rid="ref1 ref14 ref15 ref3 ref4 ref6 ref7">1, 3, 4, 6, 7, 14, 15</xref>
        ].
Several workbenches have been developed in the literature to
facilitate collaborative annotation [
        <xref ref-type="bibr" rid="ref11 ref13 ref16 ref8">8, 11, 13, 16</xref>
        ]. However, given the
growing volumes of unannotated documents, these variety-oriented
workbenches still have many shortcomings in terms of teamwork,
quality control and time cost. Take the sentence
“Depending on the model, a Tesla costs somewhere between 1 and 3.33
BTC” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as an example. A practical question is whether
to assign the “Price” type to the mention “between 1 and 3.33 BTC”.
The answer depends on a consensus that Bitcoin counts as
actual money [
        <xref ref-type="bibr" rid="ref17 ref5">5, 17</xref>
        ].
      </p>
      <p>
        Reaching this consensus is extremely time-consuming and
relies heavily on the two types of annotation collaboration in Table 1:
grounded collaboration and trusted collaboration. By grounded
collaboration, we mean that the resulting annotations are restricted by
sound pre-arrangements. For example, U-Compare only supports
named entity annotations in the UIMA-type system [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which can
avoid many conflicts. An alternative [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] takes the form of
semantic constraints. In more detail, a given relationship should take
parameters of specific entity types. As for trusted collaboration,
YEDDA [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] recognized common gestures from BRAT [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and
embedded many functionalities, including teamwork, multi-annotator
analysis and pairwise annotator comparison. On the basis
of the various annotations, the inter-project agreement can then be
calculated. Another strategy for trusted collaboration, the user-independent
workspace, was adopted in TeamTat [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <tbody>
            <tr>
              <th rowspan="2">Grounded</th>
              <td>UIMA-type system [<xref ref-type="bibr" rid="ref10">10</xref>]</td>
            </tr>
            <tr>
              <td>Annotation semantic constraint [<xref ref-type="bibr" rid="ref13">13</xref>]</td>
            </tr>
            <tr>
              <th rowspan="4">Trusted</th>
              <td>Teamwork [<xref ref-type="bibr" rid="ref16 ref8">8, 16</xref>]</td>
            </tr>
            <tr>
              <td>Personal workspace TeamTat [<xref ref-type="bibr" rid="ref16">16</xref>]</td>
            </tr>
            <tr>
              <td>Multi-annotator analysis [<xref ref-type="bibr" rid="ref8">8</xref>]</td>
            </tr>
            <tr>
              <td>Pairwise annotators comparison [<xref ref-type="bibr" rid="ref8">8</xref>]</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>This paper combines these two types of annotation
collaboration to structure the various mentions annotated by each annotator,
and develops a workbench named Bureau for Rapid Annotation
Tool (Brat for short). Its main functionalities include an enhanced
semantic constraint system, Vim-like shortcut keys, an annotation filter and
a graph-visualizing annotation browser. To date, over 500,000
mentions have been annotated with our Brat workbench.</p>
      <p>
It is well known that not all parameters are valid for a specific
relationship. To limit invalid annotations in an annotation project,
its manager can customize the schema at any time. Once the schema
is modified, all affected annotated mentions are adjusted
accordingly. A readable name is usually assigned to each type of
entity and relation. In addition, a list of rules is attached to each type
of relation to express the constraints between its parameters. In this way,
the manager's understanding of entities and relations
can be conveyed to all annotators.
      </p>
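      <p>The constraint rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the workbench's actual implementation: the schema layout, the type names (Product, Price, Organization), the relation names and the helper is_valid_relation are all assumptions introduced here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass

# Hypothetical schema: each relation type lists the entity types that are
# allowed for each of its two parameters. All names are illustrative.
SCHEMA = {
    "entities": {"Product", "Price", "Organization"},
    "relations": {
        "costs": {"arg1": {"Product"}, "arg2": {"Price"}},
        "produced_by": {"arg1": {"Product"}, "arg2": {"Organization"}},
    },
}


@dataclass(frozen=True)
class Entity:
    text: str
    type: str


def is_valid_relation(schema, relation, arg1, arg2):
    """Return True if both parameters satisfy the rules of the relation."""
    rules = schema["relations"].get(relation)
    if rules is None:
        # Relations that the schema does not define are rejected.
        return False
    return arg1.type in rules["arg1"] and arg2.type in rules["arg2"]


tesla = Entity("a Tesla", "Product")
price = Entity("between 1 and 3.33 BTC", "Price")
```
]]></preformat>
      <p>Under this hypothetical schema, a "costs" relation from the Tesla mention to the price mention is accepted, while the reversed parameter order is rejected.</p>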
    </sec>
    <sec id="sec-2">
      <title>Vim-like Shortcut Key</title>
      <p>
        According to our observation, the conventional annotating
operations (marking, selecting and confirming [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]) are time-consuming when one must
choose a proper candidate from more than five entity types or
relationships. To speed up the annotation procedure, our workbench
embeds many Vim-like shortcut keys [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. With these keys, one can
annotate an entity smoothly in the following steps (cf. Figure 2): 1)
move the cursor and select a span of text with the keys shown in Figure 1, 2)
acknowledge one command from the recommended candidates with TAB and
ENTER, and 3) type the leading characters and confirm the entity type.
Relation mentions can be annotated with similar operations. It is
worth noting that the key feature of this functionality is code
autocompletion, which is based on the enhanced semantic constraint system
and polymorphic type inference [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
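      <p>The autocompletion step can be approximated as follows. This is a simplified sketch under stated assumptions: it uses plain prefix matching plus a constraint filter in place of the polymorphic type inference mentioned above, and the type names, relation rules and function names are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def complete(prefix, candidates):
    """Return the candidates that start with the typed leading characters."""
    prefix = prefix.lower()
    return sorted(c for c in candidates if c.lower().startswith(prefix))


def complete_relation(prefix, relation_rules, arg1_type, arg2_type):
    """Suggest only relations whose rules accept the two selected entity types."""
    allowed = [
        name
        for name, rules in relation_rules.items()
        if arg1_type in rules["arg1"] and arg2_type in rules["arg2"]
    ]
    return complete(prefix, allowed)


# Illustrative type inventory and rules (assumptions, not the real schema).
ENTITY_TYPES = ["Person", "Price", "Product", "Organization"]
RELATION_RULES = {
    "costs": {"arg1": {"Product"}, "arg2": {"Price"}},
    "competes_with": {"arg1": {"Product"}, "arg2": {"Product"}},
    "produced_by": {"arg1": {"Product"}, "arg2": {"Organization"}},
}
```
]]></preformat>
      <p>Filtering by the constraint rules before prefix matching keeps the candidate list short, so fewer leading characters are needed to single out a type.</p>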
    </sec>
    <sec id="sec-3">
      <title>Configurable Annotation Filter</title>
      <p>with the types of interest in the current workspace. Therefore, we provide
a configurable annotation filter that toggles entity
types and relationships on or off.</p>
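      <p>A toggle-based filter of this kind can be sketched as below. The class name AnnotationFilter, the dictionary layout of an annotation and the example types are assumptions made for illustration only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
class AnnotationFilter:
    """Tracks which types are visible; toggling flips a type on or off."""

    def __init__(self, all_types):
        # Every type is visible until the user toggles it off.
        self.visible = set(all_types)

    def toggle(self, type_name):
        # Symmetric difference adds the type if absent, removes it if present.
        self.visible ^= {type_name}

    def apply(self, annotations):
        """Keep only annotations whose type is currently visible."""
        return [a for a in annotations if a["type"] in self.visible]


annotations = [
    {"text": "a Tesla", "type": "Product"},
    {"text": "between 1 and 3.33 BTC", "type": "Price"},
]
```
]]></preformat>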
    </sec>
    <sec id="sec-4">
      <title>Graph-visualizing Browser</title>
      <p>In real-world scenarios, it is not trivial to reach an agreement when
multiple annotators are involved and an entity or relation is
mentioned in multiple documents at once. To inspect the
underlying disagreements, our workbench can load and index all texts,
mentions and their types, and then visualize them in a graph
browser, as illustrated in Figure 3.</p>
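      <p>The indexing step behind such a browser can be illustrated with a small sketch. It assumes that each annotation is available as a (document, annotator, mention text, type) record and that a disagreement means one mention text carrying more than one type; the record layout, the function names and the example data are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def build_graph(records):
    """Link each mention text to its types and, per type, to the
    (document, annotator) pairs that assigned it."""
    graph = defaultdict(lambda: defaultdict(set))
    for doc_id, annotator, text, type_name in records:
        graph[text.lower()][type_name].add((doc_id, annotator))
    return graph


def conflicts(graph):
    """Return the mentions that received more than one type."""
    return {m: sorted(types) for m, types in graph.items() if len(types) > 1}


# Made-up example records, for illustration only.
records = [
    ("doc1", "alice", "3.33 BTC", "Price"),
    ("doc2", "bob", "3.33 BTC", "Quantity"),
    ("doc1", "bob", "Tesla", "Product"),
    ("doc3", "alice", "Tesla", "Product"),
]
```
]]></preformat>
      <p>In this made-up example only the mention "3.33 BTC" is flagged, because it was typed as Price in one document and as Quantity in another.</p>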
    </sec>
    <sec id="sec-5">
      <title>CONCLUSION</title>
      <p>Many projects have used our Brat workbench to annotate
entities and/or relations of interest, and to inspect potential conflicts over
variety-oriented annotations. To date, over 500,000 mentions have been
annotated with our Brat workbench. In the near future, the
Vim-like shortcut keys will be strengthened further, and machine learning
methods will be incorporated to accelerate conflict inspection.</p>
    </sec>
    <sec id="sec-6">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work is partially supported by the Strategic Priority Research
Program of Chinese Academy of Sciences (Grant No. XDA16040504),
National Key Research &amp; Development Program of China (Grant
No. 2019YFA0707202), and National Natural Science Foundation
of China (Grant Nos. 71704169 and 72074014). We also thank
Professor Yiming Jing and Rui Zheng for their help in
understanding collaboration in the field of psychology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and
          <string-name>
            <given-names>Zachary</given-names>
            <surname>Ives</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          .
          <source>Lecture Notes in Computer Science 4825 LNCS</source>
          (
          <year>2007</year>
          ),
          <fpage>722</fpage>
          -
          <lpage>735</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Jeff</given-names>
            <surname>Benson</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Here's How Much a Fully Loaded Tesla Model S Will Cost You in Bitcoin</article-title>
          . https://decrypt.co/57071/heres-how-much-a-fully-loaded-tesla-model-s-will-cost-you-in-bitcoin [Online; accessed 16-March-2021].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Kurt</given-names>
            <surname>Bollacker</surname>
          </string-name>
          , Colin Evans, Praveen Paritosh, Tim Sturge, and
          <string-name>
            <given-names>Jamie</given-names>
            <surname>Taylor</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge</article-title>
          .
          <source>In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data</source>
          .
          <fpage>1247</fpage>
          -
          <lpage>1250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Liang</given-names>
            <surname>Chen</surname>
          </string-name>
          , Shuo Xu, Lijun Zhu, Jing Zhang, Xiao-ping Lei, and
          <string-name>
            <given-names>Guancan</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>A Deep Learning based Method for Extracting Semantic Information from Patent Documents</article-title>
          .
          <source>Scientometrics</source>
          <volume>125</volume>
          ,
          <issue>1</issue>
          (
          <year>2020</year>
          ),
          <fpage>289</fpage>
          -
          <lpage>312</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Vanessa</given-names>
            <surname>Dirwai</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Should Christians Trade Bitcoin And Other Cryptocurrencies?</article-title>
          https://preciousearnings.medium.com/should-christians-trade-bitcoin-and-other-cryptocurrencies [Online; accessed 22-August-2021].
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Xin</given-names>
            <surname>Dong</surname>
          </string-name>
          , Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Ni Lao, Kevin Murphy, Thomas Strohmann, Shaohua Sun, and
          <string-name>
            <given-names>Wei</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Knowledge Vault: a Web-scale Approach to Probabilistic Knowledge Fusion</article-title>
          .
          <source>In Proceedings of the 20th ACM SIGKDD International Conference</source>
          .
          <fpage>601</fpage>
          -
          <lpage>610</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Oren</given-names>
            <surname>Etzioni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michele</given-names>
            <surname>Banko</surname>
          </string-name>
          , Stephen Soderland, and
          <string-name>
            <given-names>Daniel S.</given-names>
            <surname>Weld</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Open Information Extraction from the Web</article-title>
          .
          <source>Commun. ACM</source>
          <volume>51</volume>
          ,
          <issue>12</issue>
          (
          <year>2008</year>
          ),
          <fpage>68</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Rezarta</given-names>
            <surname>Islamaj</surname>
          </string-name>
          , Dongseop Kwon, Sun Kim, and
          <string-name>
            <given-names>Zhiyong</given-names>
            <surname>Lu</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>TeamTat: A Collaborative Text Annotation Tool</article-title>
          . CoRR abs/2004.11894 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Steven L.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gary T.</given-names>
            <surname>Leavens</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Polymorphic Type Inference in Scheme</article-title>
          .
          <source>Computer Science Technical Reports</source>
          <volume>75</volume>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Yoshinobu</given-names>
            <surname>Kano</surname>
          </string-name>
          , William A. Baumgartner Jr.,
          <string-name>
            <given-names>Luke</given-names>
            <surname>McCrohon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sophia</given-names>
            <surname>Ananiadou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bretonnel Cohen</surname>
          </string-name>
          , Lawrence Hunter, and
          <string-name>
            <given-names>Jun'ichi</given-names>
            <surname>Tsujii</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>U-Compare: Share and Compare Text Mining Tools with UIMA</article-title>
          .
          <source>Bioinform</source>
          .
          <volume>25</volume>
          ,
          <issue>15</issue>
          (
          <year>2009</year>
          ),
          <fpage>1997</fpage>
          -
          <lpage>1998</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Mariana L.</given-names>
            <surname>Neves</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ulf</given-names>
            <surname>Leser</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A Survey on Annotation Tools for the Biomedical Literature</article-title>
          .
          <source>Briefings Bioinform</source>
          <volume>15</volume>
          ,
          <issue>2</issue>
          (
          <year>2014</year>
          ),
          <fpage>327</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Kim</given-names>
            <surname>Schulz</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Hacking Vim: A Cookbook to Get the Most Out of The Latest Vim Editor</article-title>
          .
          <source>Packt Publishing Ltd.</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Pontus</given-names>
            <surname>Stenetorp</surname>
          </string-name>
          , Sampo Pyysalo, Goran Topic, Tomoko Ohta, Sophia Ananiadou, and
          <string-name>
            <given-names>Jun'ichi</given-names>
            <surname>Tsujii</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>BRAT: A Web-based Tool for NLP-Assisted Text Annotation</article-title>
          .
          <source>In Conference of the 13th European Chapter of the Association for Computational Linguistics</source>
          , Walter Daelemans, Mirella Lapata, and Lluís Màrquez (Eds.).
          <fpage>102</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Zheng</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shuo</given-names>
            <surname>Xu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Lijun</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Semantic Relation Extraction Aware of N-Gram Features from Unstructured Biomedical Text</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>86</volume>
          (
          <year>2018</year>
          ),
          <fpage>59</fpage>
          -
          <lpage>70</lpage>
          . https://doi.org/10.1016/j.jbi.2018.08.011
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Shuo</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xin</given-names>
            <surname>An</surname>
          </string-name>
          , Lijun Zhu, Yunliang Zhang, and Haodong Zhang.
          <year>2015</year>
          .
          <article-title>A CRF-based System for Recognizing Chemical Entity Mentions (CEMs) in Biomedical Literature</article-title>
          .
          <source>Journal of Cheminformatics</source>
          <volume>7</volume>
          , Suppl 1 (
          <year>2015</year>
          ), S11. https://doi.org/10.1186/1758-2946-7-S1-S11
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Yang</surname>
          </string-name>
          , Yue Zhang,
          <string-name>
            <given-names>Linwei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>Xingxuan</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>YEDDA: A Lightweight Collaborative Text Span Annotation Tool</article-title>
          .
          <source>In Proceedings of the 56th Annual Meeting Association for Computational Linguistics, Fei Liu and Thamar Solorio (Eds.)</source>
          .
          <fpage>31</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>David</given-names>
            <surname>Yermack</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Chapter 2 - Is Bitcoin a Real Currency? An Economic Appraisal</article-title>
          . In Handbook of Digital Currency, David Lee Kuo Chuen (Ed.). Academic Press, San Diego,
          <fpage>31</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>