<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RelVis: Benchmarking OpenIE Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rudolf Schneider</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tom Oberhauser</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Klatt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felix A. Gers</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Loser</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Beuth University of Applied Sciences</institution>
          ,
          <addr-line>Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We demonstrate RelVis, a toolkit for benchmarking Open Information Extraction (OIE) systems. RelVis enables the user to perform a comparative analysis among OIE systems such as ClausIE, OpenIE 4.2, Stanford OpenIE or PredPatt. It features an intuitive dashboard that lets the user explore annotations created by OIE systems and evaluate the impact of five common error classes. Our comprehensive benchmark contains four data sets with a total of 4522 labeled sentences and 11243 binary or n-ary OIE relations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Video demonstration: https://www.youtube.com/watch?v=Hs87hIe-HEs</p>
    </sec>
    <sec id="sec-2">
      <title>Demo Walkthrough and Exploration</title>
      <p>Startup. On system initialization, RelVis reads the gold annotations and performs a
quantitative evaluation. Next, the system stores the extractions and the gold annotations
in an RDBMS.</p>
      <p>Dashboards for exploring annotations. The user can now start exploring
results and understanding the behaviour of each system. Figure 1 shows the
web-based dashboard, which lists sentences together with precision, recall and F scores for each OIE
system and each error class. RelVis plots error distributions as a Kiviat diagram
and draws bar charts for comparing the impact of each error class across OIE systems. In
addition, the user can export results as tables and CSV files from the database.</p>
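      <p>The quantitative evaluation behind these dashboard scores can be illustrated with a minimal Python sketch. It assumes that the numbers of correct extractions, emitted extractions and gold annotations are already known for one system. The names Counts, precision, recall and f_score are hypothetical and not part of RelVis, and the beta parameter is an assumption, since the text does not state which F score is reported.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical per-system tallies; field names are not taken from RelVis."""
    correct: int    # extractions that match a gold annotation
    extracted: int  # all extractions emitted by the system
    gold: int       # all gold annotations


def precision(c: Counts) -> float:
    # Share of emitted extractions that are correct.
    return c.correct / c.extracted if c.extracted else 0.0


def recall(c: Counts) -> float:
    # Share of gold annotations that the system found.
    return c.correct / c.gold if c.gold else 0.0


def f_score(c: Counts, beta: float = 1.0) -> float:
    # Weighted harmonic mean of precision and recall (beta=1 gives F1).
    p, r = precision(c), recall(c)
    denom = beta * beta * p + r
    return (1 + beta * beta) * p * r / denom if denom else 0.0


if __name__ == "__main__":
    counts = Counts(correct=30, extracted=40, gold=60)
    print(precision(counts), recall(counts), f_score(counts))
```
      </preformat>
      <p>The same functions could be applied per error class by tallying only the extractions that carry that class.</p>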
      <p>Understanding and adding a single annotation. RelVis visualizes OIE
extractions at the sentence level. For each hit by a system, the user can drill down into a
single sentence and inspect the extracted predicates, shown in green, and arguments,
shown in blue, as illustrated in Figure 2.</p>
      <p>
        Next, she can drill down into correct or incorrect annotations, add
labels for the error classes of incorrect annotations, or leave a comment (see also
Figure 2). We permit the user to apply multiple error classes to each subpart of
an annotation. She can then focus on a sentence of interest and compare
extractions between different OIE systems. If no gold annotations are available,
the user can create them using RelVis. Note that such a process is also feasible
with standard annotation tools, such as BRAT [
        <xref ref-type="bibr" rid="ref10">10</xref>
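        ]. However, in practice we
found that such standard tools require many configuration steps to adapt to
OIE relations. The user selects a sentence to annotate and starts the first
annotation by clicking the "Add new OIE Relation" button. Next, she marks
the predicate and arguments in the sentence for her first annotation by selecting
them with the cursor.
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Data set</th><th>Relation type</th><th>Domain</th><th>Sentences</th></tr>
            </thead>
            <tbody>
              <tr><td>NYT-nary</td><td>n-ary</td><td>News</td><td>222</td></tr>
              <tr><td>WEB-500</td><td>binary</td><td>Web</td><td>500</td></tr>
              <tr><td>PENN-100</td><td>binary</td><td>Mixed</td><td>100</td></tr>
              <tr><td>OIE2016</td><td>n-ary</td><td>Wiki</td><td>3200</td></tr>
            </tbody>
          </table>
        </table-wrap>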
      </p>
      <p>
        Predefined common error classes. Over the years, different error classes have
been defined for evaluating OIE systems. We identified the following five types of
errors as most relevant in our previous work [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. (1) Wrong Boundaries indicate
too large or too small boundaries for an argument or predicate of an OIE
extraction. A downstream application has to filter out or correct incorrect boundaries,
which may cause a drastic loss in recall. (2) Redundant Extractions appear if the
OIE system does not filter out redundant tuples. (3) Uninformative Extractions are
tuples without any reasonable value. This error type causes additional
processing effort without delivering any value. (4) Missing Extractions are relations
that were not found by a system. (5) Wrong Extractions are tuples that convey
wrong information. It is not possible to recover from an error of this class, and it
emits a wrong signal.
      </p>
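      <p>The five error classes and the option of attaching several of them to each subpart of an annotation could be modelled as in the following Python sketch. This is an illustration only: the names ErrorClass, LabeledExtraction, add_error and error_counts, as well as the subpart keys such as "predicate" and "arg1", are hypothetical and do not reflect RelVis' internal data model.</p>
      <preformat>
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable, List, Set


class ErrorClass(Enum):
    WRONG_BOUNDARIES = "Wrong Boundaries"
    REDUNDANT_EXTRACTION = "Redundant Extraction"
    UNINFORMATIVE_EXTRACTION = "Uninformative Extraction"
    MISSING_EXTRACTION = "Missing Extraction"
    WRONG_EXTRACTION = "Wrong Extraction"


@dataclass
class LabeledExtraction:
    """Hypothetical container for one extraction and its error labels."""
    predicate: str
    arguments: List[str]
    # Maps a subpart name (assumed keys: "predicate", "arg0", ...) to its labels.
    errors: Dict[str, Set[ErrorClass]] = field(default_factory=dict)

    def add_error(self, part: str, error: ErrorClass) -> None:
        # A subpart may carry several error classes at once.
        self.errors.setdefault(part, set()).add(error)


def error_counts(extractions: Iterable[LabeledExtraction]) -> Dict[ErrorClass, int]:
    # Count each error class at most once per extraction.
    counts = {e: 0 for e in ErrorClass}
    for ex in extractions:
        for e in set().union(*ex.errors.values()):
            counts[e] += 1
    return counts


if __name__ == "__main__":
    ex = LabeledExtraction("was born in", ["Einstein", "Ulm in 1879"])
    ex.add_error("arg1", ErrorClass.WRONG_BOUNDARIES)
    print(error_counts([ex]))
```
      </preformat>
      <p>Counts of this kind are the sort of per-class totals that an error distribution plot would be drawn from.</p>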
    </sec>
    <sec id="sec-3">
      <title>System Design</title>
      <p>
        RelVis currently supports the following OIE Systems: Stanford OpenIE [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
OpenIE 4.2 [
        <xref ref-type="bibr" rid="ref2 ref5">2,5</xref>
        ], ClausIE [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and PredPatt [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. To add a new OIE system,
the user can either implement a Java interface or upload results in RelVis' data
format. The system is compatible with four data sets (see Table 1), of which
two feature only binary relations with two arguments. The data sets NYT-nary and
OIE2016 also contain n-ary relations. These labeled data sets originate from [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. RelVis supports exact matching of boundaries in text against a gold standard. This
matching strategy delivers exact results for computing precision. However, it
penalizes other, potentially correct, boundary definitions beyond the
gold standard. Dealing with multiple OIE systems and their different annotation
styles requires a less restrictive matching strategy. As a second strategy, we use
a containment match. Here an argument or predicate is considered correct
if it at least contains a gold standard annotation; hence, spans from the gold
standard may be contained (fully) inside the spans of the annotation from the
OIE system. However, this strategy may label over-specific tuples as correct
and may lead to a lower precision. A containment strategy still penalizes binary
systems on n-ary data sets. Therefore, we introduce as a third strategy a relaxed
containment strategy, which removes the penalty for wrong boundaries, especially
for over-specific extractions. This strategy counts an extraction as correct even
when the number of arguments does not match the gold standard.
      </p>
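      <p>The three matching strategies can be contrasted in a small Python sketch. It assumes that predicates and arguments are character spans with an inclusive start and an exclusive end, and that arguments are compared in order. The names Span, Extraction and the three match functions are hypothetical. In particular, the rule used here for the relaxed strategy (every system argument must contain at least one gold argument, regardless of the argument count) is one possible reading of the description, not necessarily the rule RelVis implements.</p>
      <preformat>
```python
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset


class Extraction(NamedTuple):
    predicate: Span
    arguments: List[Span]


def contains(outer: Span, inner: Span) -> bool:
    # True if the inner span lies fully inside the outer span.
    return inner.start >= outer.start and outer.end >= inner.end


def exact_match(system: Extraction, gold: Extraction) -> bool:
    # All boundaries must be identical to the gold standard.
    return (system.predicate == gold.predicate
            and list(system.arguments) == list(gold.arguments))


def containment_match(system: Extraction, gold: Extraction) -> bool:
    # Same number of arguments; each system span contains its gold span.
    return (len(system.arguments) == len(gold.arguments)
            and contains(system.predicate, gold.predicate)
            and all(contains(s, g)
                    for s, g in zip(system.arguments, gold.arguments)))


def relaxed_containment_match(system: Extraction, gold: Extraction) -> bool:
    # Assumed rule: argument counts may differ, but every system argument
    # must contain at least one gold argument.
    return (len(system.arguments) > 0
            and contains(system.predicate, gold.predicate)
            and all(any(contains(s, g) for g in gold.arguments)
                    for s in system.arguments))


if __name__ == "__main__":
    gold = Extraction(Span(10, 15), [Span(0, 9), Span(16, 25), Span(29, 40)])
    binary = Extraction(Span(10, 15), [Span(0, 9), Span(16, 40)])
    print(exact_match(binary, gold),
          containment_match(binary, gold),
          relaxed_containment_match(binary, gold))
```
      </preformat>
      <p>In this example, a binary extraction whose second argument spans two gold arguments fails the exact and containment matches because of the differing argument count, but passes the relaxed containment match.</p>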
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>
        To the best of our knowledge, RelVis is the first attempt to integrate four different
OIE systems and four different data sets in a single comprehensive benchmark
system for OIE systems. It provides dashboards for in-depth qualitative
evaluations, classifies errors into five common, extendable classes and supports
user-defined annotations or data sets. In future work, we will add output
compatibility with BRAT annotations [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. RelVis enables the community to explore
existing OIE systems and to add home-grown ones, and is available as open source at
https://github.com/SchmaR/RelVis.
      </p>
      <p>Acknowledgement. Our work is funded by the German Federal Ministry of
Economic Affairs and Energy (BMWi) under grant agreement 01MD16011E
(Medical Allround-Care Service Solutions), grant agreement 01MD15010B (Smart
Data Web) and H2020 ICT-2016-1 grant agreement 732328 (FashionBrain).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Angeli</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Premkumar</surname>
            ,
            <given-names>M.J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Leveraging linguistic structure for open domain information extraction</article-title>
          .
          <source>In: ACL</source>
          . pp.
          <fpage>344</fpage>
          –
          <lpage>354</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Christensen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soderland</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , et al.:
          <article-title>An analysis of open information extraction based on semantic role labeling</article-title>
          .
          <source>In: K-CAP</source>
          . pp.
          <fpage>113</fpage>
          –
          <lpage>120</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Corro</surname>
            ,
            <given-names>L.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gemulla</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>ClausIE: clause-based open information extraction</article-title>
          .
          <source>In: WWW</source>
          . pp.
          <fpage>355</fpage>
          –
          <lpage>366</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Mausam:
          <article-title>Open information extraction systems and downstream applications</article-title>
          .
          <source>In: IJCAI</source>
          . pp.
          <fpage>4074</fpage>
          –
          <lpage>4077</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pal</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , Mausam:
          <article-title>Demonyms and compound relational nouns in nominal open IE</article-title>
          .
          <source>In: AKBC at NAACL</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>de Sa Mesquita</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidek</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barbosa</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Effectiveness and efficiency of open relation extraction</article-title>
          .
          <source>In: EMNLP</source>
          . pp.
          <fpage>447</fpage>
          –
          <lpage>457</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oberhauser</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klatt</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gers</surname>
            ,
            <given-names>F.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loser</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Analysing errors of open information extraction systems</article-title>
          .
          <source>In: BLGNLP at EMNLP</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Stanovsky</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dagan</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Creating a large benchmark for open information extraction</article-title>
          .
          <source>In: EMNLP</source>
          . pp.
          <fpage>2300</fpage>
          –
          <lpage>2305</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Stanovsky</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dagan</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , Mausam:
          <article-title>Open IE as an intermediate structure for semantic tasks</article-title>
          .
          <source>In: ACL</source>
          . pp.
          <fpage>303</fpage>
          –
          <lpage>308</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Stenetorp</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pyysalo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Topic</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ohta</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsujii</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>brat: a web-based tool for NLP-assisted text annotation</article-title>
          .
          <source>In: EACL</source>
          . pp.
          <fpage>102</fpage>
          –
          <lpage>107</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>White</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reisinger</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sakaguchi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vieira</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudinger</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rawlins</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Durme</surname>
            ,
            <given-names>B.V.</given-names>
          </string-name>
          :
          <article-title>Universal decompositional semantics on universal dependencies</article-title>
          .
          <source>In: EMNLP</source>
          . pp.
          <fpage>1713</fpage>
          –
          <lpage>1723</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>