<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adjudication of Coreference Annotations via Finding Optimal Repairs of Equivalence Relations ?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter Schüller</string-name>
          <email>peter.schuller@marmara.edu.tr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Engineering Department Faculty of Engineering Marmara University</institution>
          ,
          <country country="TR">Turkey</country>
        </aff>
      </contrib-group>
      <fpage>57</fpage>
      <lpage>71</lpage>
      <abstract>
        <p>We describe encodings for merging multiple coreference annotations into a single annotation, subject to hard constraints (consistency) and optimization criteria (minimal divergence from annotators) using Answer Set Programming (ASP). This task requires guessing an equivalence relation with a large number of elements. We report on experiments with real-world instances based on the METU-Sabanci Turkish Treebank using the ASP tools Gringo, Clasp, and Wasp. We also describe a patch for Gringo which facilitates fine-grained instantiation analysis and use it to analyze experimental results.</p>
      </abstract>
      <kwd-group>
        <kwd>Coreference Resolution</kwd>
        <kwd>Adjudication</kwd>
        <kwd>Answer Set Programming</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Coreference Resolution [9, 14, 20, 21] is the task of finding phrases in a text that refer
to the same entity. Coreference is commonly annotated by marking subsequences of
tokens in the input text as mentions and putting sets of mentions into chains such that
all mentions in a chain refer to the same, clearly identifiable entity in the world.</p>
      <p>For example [22] in the text “ John is a musician. He played a new song. A girl was
listening to the song. ‘It is my favorite,’ John said to her.” we can identify the following
mentions.</p>
      <p>[John] (i) is [a musician] (ii). [He] (iii) played [a new song] (iv).
[A girl] (v) was listening to [the song] (vi).</p>
      <p>“[It] (vii) is [ [my] (ix) favorite] (viii),” [John] (x) said to [her] (xi).</p>
      <p>Roman superscripts denote mention IDs. This text contains the following chains: {(i),
(iii ), (ix ), (x)} (John, He, my, John); {(iv ), (vi ), (vii )} (a new song, the song, It); and
{(v), (xi )} (A girl, her), where numbers refer to mention IDs.1</p>
      <p>For supervised learning and verification of coreference resolution methods,
annotated corpora are an important resource. For creating such resources, we usually give
? This work has been supported by Scientific and Technological Research Council of Turkey
(TUBITAK) Grant 114E430.
1 Mention (viii ) is in a predicative relationship with (vii ). Per convention such mentions are
not considered coreferent, hence (viii ) is missing in the second chain.
the same document to several annotators who produce mention and chain annotations
independently. Afterwards we enter a process of manually adjudicating these—often
conflicting—annotations to produce a single gold standard. Annotations in coreference
resolution are global on the document level, i.e., it is not possible to decide the truth of
the annotation of one token, mention, or chain, without considering other tokens,
mentions, and chains in the same document. Merging such annotations is a tedious task,
and the final decision on the gold standard is done based on the adjudicators’ expert
opinion, however tool support for this task would be useful.</p>
      <p>We here present a method for automatic merging of coreference annotations based
on combinatorial optimization, where the result combines the information provided by
all annotators while performing minimal corrective modifications to ensure consistency
of the output. Our method is based on ASP optimization and provides various
parameters for defining which corrective modifications to prefer.</p>
      <p>We here focus on the experimental evaluation of two encodings for this task which
use different underlying representations for the equivalence relation:
– an encoding representing a transitive closure of mention-mention links, scoring
repairs based on the closure, and deriving a mention-chain representation from the
closure; and
– another encoding which directly represents mention-chain links and scores repairs
based on pairs of mention-chain links.</p>
      <p>We compare the Clasp [19] and Wasp [2] solver tools with branch-and-bound as well
as with unsatisfiable-core optimization modes on both encodings and several objective
function variations. Moreover we describe a patch to the Gringo [18] instantiation tool
that allows us to shed more light on the reasons for performance of encoding variants
by obtaining a detailed analysis of instantiation of nonground rules.</p>
      <p>In the following we provide preliminaries on Coreference Resolution and Answer
Set Programming in Section 2, describe our approach for automatic adjudication in
Section 3, provide ASP encodings in Section 4, discuss experimental evaluation and
instantiation analysis with Gringo in Section 5, and conclude with related and future
work in Section 6.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>We next introduce coreference resolution informally and mathematically, and then give
a brief introduction of Answer Set Programming.
2.1</p>
      <sec id="sec-2-1">
        <title>Coreference Resolution</title>
        <p>Coreference resolution is the task of finding phrases in a text that refer to the same
entity [9, 14, 20, 21, 26, 33]. We call such phrases mentions, and we call a group of
mentions that refers to one entity chains.</p>
        <p>In formal mathematical terms, we can describe mention detection and coreference
resolution as follows. Given a document D which is sequence of tokens w1, . . . , wn,
mention detection is the task of finding a setM = {(f1, t1), . . . , (fm, tm)} of mentions,
i.e., pairs (fi, ti) of indexes, 1 ≤ i ≤ m, into the sequence of tokens (1 ≤ fi ≤ ti ≤ n).
Given a set M of mentions, coreference resolution is the task of partitioning M into
k (disjoint) partitions (chains) P such that all sequences of tokens wfi , . . . , wti
corresponding to mention (fi, ti) in one chain refer to the same entity.</p>
        <p>
          Example 1 (Introduction example continued). We have 31 tokens (including
punctuation). Some of the tokens are w1 = ‘John’, w2 = ‘is’, . . . , w30 = ‘her’, w31 = ‘.’, the
set of mentions is M = {(
          <xref ref-type="bibr" rid="ref1 ref1">1, 1</xref>
          ), (
          <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
          ), (
          <xref ref-type="bibr" rid="ref6 ref6">6, 6</xref>
          ), . . ., (
          <xref ref-type="bibr" rid="ref30 ref30">30, 30</xref>
          )} and the correct coreference
partitions are P = {{(
          <xref ref-type="bibr" rid="ref1 ref1">1, 1</xref>
          ), (
          <xref ref-type="bibr" rid="ref6 ref6">6, 6</xref>
          ), (
          <xref ref-type="bibr" rid="ref23 ref23">23, 23</xref>
          ), (
          <xref ref-type="bibr" rid="ref27 ref27">27, 27</xref>
          )}, . . . , {(
          <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
          ), (
          <xref ref-type="bibr" rid="ref30 ref30">30, 30</xref>
          )}} where,
(
          <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
          ) represents ‘A girl’ and (
          <xref ref-type="bibr" rid="ref30 ref30">30, 30</xref>
          ) represents ‘her’.
tu
        </p>
        <p>Given a set of mentions, there are exponentially many potential solutions to the
coreference resolution problem, and finding a globally optimal solution is NP-hard
according to most measures of coreference optimality [31]. In coreference annotation,
humans create both mentions and chains, so the problem becomes more involved and
automatic methods for creating consistent result annotations are helpful.
2.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Answer Set Programming</title>
        <p>Answer Set Programming (ASP) is a logic programming paradigm which is suitable for
knowledge representation and finding solutions for computationally (NP-)hard
problems [5, 6, 17, 23]. An ASP program consists of rules of the form</p>
        <p>Head ← Body .
where Head is either a first-order atom or a choice construction, andBody is a
conjunction of first-order atoms. Intuitively, a rule makes the head logically true if the body is
satisfied. Rules without body arefacts (the head is always true) and rules without head
are constraints (if the body is satisfied, the candidate solution is discarded). Choices of
the form L {A1; . . . ; An} U generate all solution candidates where between L and U
atoms from the set {A1, . . . , An} are true.</p>
        <p>In addition to combinatorial search, by using weak constraints of the form
B;ody .
we can perform combinatorial optimization: a weak constraint incurs cost Cost for
each unique tuple Tuple such that Body is satisfied in the answer set with respect to
that tuple.</p>
        <p>For details of syntax and semantics we refer to the ASP-Core-2 standard [7].
Efficient solvers are available for computing answer sets, in this work we use the Gringo
4.5.0 [18] grounder and experiment with the Clasp 3.1.2 [19] and Wasp 2.0 [2] solvers.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Automatic Coreference Adjudication</title>
      <p>We first describe the problem more formally and give an example of inconsistent
annotations, then we describe how we approach the problem and its input and output in</p>
      <p>ASP. Finally we provide and describe two logic programs that use different internal
representations.</p>
      <p>Mathematically, we can describe adjudication of multiple coreference resolution
annotations as follows.</p>
      <p>Given a document D of tokens as described in Section 2.1, and u ≥ 2 partitions
P1, . . . , Pu of mentions (where each partition can be based on a different
underlying set of mentions) we need to create a single set of mentions M and a
single partition P which groups the set M into chains.</p>
      <p>Clearly, annotations might be contradictory and we need to ensure certain structural
constraints in the solution. For example mention A cannot be coreferent with mention
B if A includes B. Moreover, a solution might contain equivalences between mentions
that are not present in any annotation. This is because chains are equivalence relations,
and if we merge equivalence relations that are not subsets of each other, the new
equivalence relation is the reflexive, symmetric, and transitive closure of the original relations.
Example 2 (continued). Assume that in parallel to chain {(i ), (iii ), (ix ), (x )} in the
Introduction, we obtain (from another annotator) a chain {(i ), (ii ), (x )} If we merge these
chains naively by merging the sets, we obtain a single chain {(i ), (ii ), (iii ), (ix ), (x )}
although no annotator indicated that (ii ) and (iii ) belong to the same chain.
tu</p>
      <p>Therefore we need to merge chains in a more intelligent way.
3.1</p>
      <sec id="sec-3-1">
        <title>Merge Representation and Merge Preferences</title>
        <p>We want to use as much information from annotations as possible, while keeping
consistency of the solution. We achieve this by attaching a cost to ignored mentions, to
ignored links between mentions, and to links that were not annotated but arise due to
merged chains.</p>
        <p>Our conceptual model for merging annotations from multiple sources is as follows.
– A subset of annotated chain-mention links is selected from all annotators.
– From the resulting chains, each chain is represented as a set of links between all
mentions it contains (i.e., as a symmetric, transitive relation).
– A subset of these links is selected to build the combined result.
– The transitive, and reflexive closure of these selected links is created, which
provides an equivalence relation over mentions of all annotators.
– We use this equivalence relation as new set of chains.
– We ensure the solution adheres to two constraints: no mention can be a part of itself,
and each chain must contain more than one mention.</p>
        <p>We search for the choice of chain-mention and mention-mention links that
optimizes an objective function. As objective functions we incur a cost of
– costm (M ) for each ignored mention M in a chain;
– costl (M , M 0) for each ignored link between mentions M and M 0; and
– costa (M , M 0) for each link between mentions M and M 0 that is present in the
result equivalence relation but not part of any annotation.</p>
        <p>In Section 4.3 we provide concrete preference configurations of these functions.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Input and Output ASP Representation</title>
        <p>Given u partitions P1, . . . , Pu where each chain C ∈ Pi contains a set of mentions of
form (from, to) we create facts of form mention(A, M , Fr , To) where A represents
the partition (i.e., which annotator created the annotation), M is a unique identifier for
each mention, and (Fr , To) represents the mention. Moreover we create facts of form
cm(A, Chain, M ) for each mention with identifierM that is in a chain with identifier
Chain according to annotator A.</p>
        <p>Example 3 (continued). Given the text from our running example, we can mark tokens
(including punctuation) with integers as follows.</p>
        <p>John1 is2 a3 musician4 .5 He6 played7 a8 new9 song10 .11</p>
        <p>Consider that annotator a1 created chain {‘John’a, ‘a musician’b, ‘He’c} while
annotator a2 produced {‘musician’d, ‘He’e} where superscripts indicate mention IDs.
These annotations would be represented by the following ASP facts.</p>
        <p>mention(a1 , a, 1 , 1 ) ← . cm(a1 , c1 , a) ← .
mention(a1 , b, 3 , 4 ) ← . cm(a1 , c1 , b) ← .
mention(a1 , c, 6 , 6 ) ← . cm(a1 , c1 , c) ← .
mention(a2 , d , 4 , 4 ) ← . cm(a2 , c2 , d ) ← .</p>
        <p>mention(a2 , e, 6 , 6 ) ← . cm(a2 , c2 , e) ← .
where c1 and c2 represent the first and second chain, respectively.</p>
        <p>
          The output of our logic program will be a set of chains without annotator
information. Therefore we want to obtain atoms of the form resultcm (Chain, mid (Fr , To))
which indicate that in chain Chain there is a mention from token F r to token T o.
Example 4 (continued). Assume we have merged the annotations of the previous
example into two chains v = {‘John’, ‘a musician’} and w = {‘musician’, ‘He’}, this
would be represented by the following atoms:
resultcm (v, mid (
          <xref ref-type="bibr" rid="ref1 ref1">1, 1</xref>
          ))
resultcm (w, mid (
          <xref ref-type="bibr" rid="ref4 ref4">4, 4</xref>
          ))
resultcm (v, mid (
          <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
          ))
resultcm (w, mid (
          <xref ref-type="bibr" rid="ref6 ref6">6, 6</xref>
          ))
Note that it is frequently the case that mentions contain other mentions, but this is
usually forbidden in case these mentions belong to the same chain.
tu
tu
4
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>ASP Encodings</title>
      <p>We next provide two ASP encodings that realize the above merging strategy. The first
encoding explicitly represents the transitive closure of mention-mention links, while
the second encoding avoids such a representation by representing the resulting
equivalence relation only as chain-mention links. We use functions cost X for configuring the
preference relation; these are part of the encoding and resolved during instantiation.
{ usecm (A, Chain, M ) : cm(A, Chain, M ) } ← .</p>
      <p>;cm(A, C , M ), not usecm (A, C , M ).
link (A, M1 , M2 ) ← usecm (A, C , M1 ), usecm (A, C , M2 ), M1 &lt; M2 .
{ uselink (A, M1 , M2 ) } ← link (A, M1 , M2 ).</p>
      <p>
        ;link (A, M1 , M2 ), not uselink (A, M1 , M2 ). [costl (M1 , M2 )@1 , A, M1 , M2 , mm] (
        <xref ref-type="bibr" rid="ref5">5</xref>
        )
clink (X , Y ) ← X = mid (Fr1 , To1 ), Y = mid (Fr2 , To2 ), X &lt; Y ,
      </p>
      <p>uselink (A, M1 , M2 ), mention(A, M1 , Fr1 , To1 ), mention(A, M2 , Fr2 , To2 ).
clink (X , Y ) ← X = mid (Fr1 , To1 ), Y = mid (Fr2 , To2 ), X &lt; Y ,</p>
      <p>
        uselink (A, M2 , M1 ), mention(A, M1 , Fr1 , To1 ), mention(A, M2 , Fr2 , To2 ).
;clink →(X , Y ), X &lt; Y , not clink (X , Y ). [cost a(X, Y )@1, X, Y, add]
clink →(X , Y ) ← clink (X , Y ).
clink →(X , Y ) ← clink →(Y , X ).
clink →(X , Z ) ← clink →(X , Y ), clink →(Y , Z ).
not repr (Y ) ← clink →(X, Y ), X &lt; Y.
resultc,m (X , Y ) ← clink →(X , Y ), not notrepr (X ).
resultc(C ) ← resultc,m (C , _).
← resultc(C ), #count { Y : resultc,m (C , Y ) } ≤ 1 .
← clink (mid (F1 , T1 ), mid (F2 , T2 )),F1 ≤ F2 , T2 ≤ T1 ,(F1 , T1 ) 6= (F2 , T2 ).
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        )
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        )
(
        <xref ref-type="bibr" rid="ref6">6</xref>
        )
(
        <xref ref-type="bibr" rid="ref7">7</xref>
        )
(
        <xref ref-type="bibr" rid="ref8">8</xref>
        )
(
        <xref ref-type="bibr" rid="ref9">9</xref>
        )
(
        <xref ref-type="bibr" rid="ref10">10</xref>
        )
(
        <xref ref-type="bibr" rid="ref11">11</xref>
        )
(
        <xref ref-type="bibr" rid="ref12">12</xref>
        )
(
        <xref ref-type="bibr" rid="ref13">13</xref>
        )
(
        <xref ref-type="bibr" rid="ref14">14</xref>
        )
(
        <xref ref-type="bibr" rid="ref15">15</xref>
        )
(
        <xref ref-type="bibr" rid="ref16">16</xref>
        )
ann(A) ← cm(A, _, _).
count _chain(A, N ) ← ann(A), N = #count { C : cm(A, C , _) }.
maxchain(N ∗ 3 /2 ) ← N = #max { C : count _chain(_, C ) }.
{ resultc(X ) : X = 1 ..Max } ← maxhain(Max ).
resultc(X −1 ) ← resultc(X ), 1 &lt; X .
cmention(M ) ← clink (M , _).
cmention(M ) ← clink (_, M ).
1 { resultcm (C , M ) : resultc(C ) } 1 ← cmention(M ).
← clink (X , Y ), resultcm (C , X ), not resultcm (C , Y ).
← clink (X , Y ), not resultcm (C , X ), resultcm (C , Y ).
      </p>
      <p>
        ;resultcm (C , X ), resultcm (C , Y ), X &lt; Y , not clink (X , Y ).
(
        <xref ref-type="bibr" rid="ref17">17</xref>
        )
(
        <xref ref-type="bibr" rid="ref18">18</xref>
        )
(
        <xref ref-type="bibr" rid="ref19">19</xref>
        )
(
        <xref ref-type="bibr" rid="ref20">20</xref>
        )
(
        <xref ref-type="bibr" rid="ref21">21</xref>
        )
(
        <xref ref-type="bibr" rid="ref22">22</xref>
        )
(
        <xref ref-type="bibr" rid="ref23">23</xref>
        )
(
        <xref ref-type="bibr" rid="ref24">24</xref>
        )
(
        <xref ref-type="bibr" rid="ref25">25</xref>
        )
(
        <xref ref-type="bibr" rid="ref26">26</xref>
        )
(
        <xref ref-type="bibr" rid="ref27">27</xref>
        )
      </p>
      <p>
        Based on clink →, weak constraint (
        <xref ref-type="bibr" rid="ref11">11</xref>
        ) incurs cost cost a(M, M 0) for all pairs of
mentions that are present in the result equivalence relation but not in any
mentionmention link obtained from annotators (clink ).
      </p>
      <p>
        The equivalence relation clink → is transformed back into a chain-mention
representation in (
        <xref ref-type="bibr" rid="ref13">13</xref>
        ) and (
        <xref ref-type="bibr" rid="ref14">14</xref>
        ) by using the lexicographically smallest mention as representative
for each chain, where (
        <xref ref-type="bibr" rid="ref12">12</xref>
        ) identifies lexicographically non-smallest mentions.
      </p>
      <p>
        Finally we impose two structural constraints on the result: a chain has to contain
more than one mention (
        <xref ref-type="bibr" rid="ref15">15</xref>
        ) and a mention that is within another mention cannot be in
the same chain with that mention (
        <xref ref-type="bibr" rid="ref16">16</xref>
        ). Note that, thanks to the elaboration tolerance [25]
of ASP, constraints (
        <xref ref-type="bibr" rid="ref15">15</xref>
        ) and (
        <xref ref-type="bibr" rid="ref16">16</xref>
        ) can be removed if the respective restriction should
not be applied, or can be made more restrictive (e.g., mentions in a chain cannot be
neighbors) independent from the rest of the encoding.
      </p>
      <p>
        The encoding shown in Figure 1 admits as answer sets all possible consistent merges
of the given annotations, including solutions where all annotations are ignored. Weak
constraints prefer solutions where, according to the chosen cost functions, as many
given annotations as possible are used and a single consistent solution is represented
in the answer set. The balance between not ignoring annotations and not creating too
many novel links between mentions can be adjusted using the cost functions.
The difference between MM and CM is the way we represent the equivalence relation
and how we check the preference function, in particular links in the solution that are
not present in annotations. For reasons of brevity, we provide this encoding as a patch
to encoding MM: encoding CM is obtained by using rules (
        <xref ref-type="bibr" rid="ref17">17</xref>
        )–(
        <xref ref-type="bibr" rid="ref27">27</xref>
        ) shown in Figure 2
instead of rules (
        <xref ref-type="bibr" rid="ref8">8</xref>
        )–(
        <xref ref-type="bibr" rid="ref11">11</xref>
        ) in encoding MM.
      </p>
      <p>
        In the CM encoding variant, (
        <xref ref-type="bibr" rid="ref17">17</xref>
        ) represents annotators, (
        <xref ref-type="bibr" rid="ref18">18</xref>
        ) finds the number of
chains for each annotator, and (
        <xref ref-type="bibr" rid="ref19">19</xref>
        ) computes 1.5 times the maximum of the largest
number of chains annotated by any annotator in the input. Encoding CM is based on
the assumption that the preferred result will not contain more chains than this amount
of chains. This assumption can safely be made because differences in the amount of
annotations are consistent over the whole document: some annotators produce more
annotations, but this is true over the whole document, therefore using the maximum
amount of chains times 1.5 is safe (using the preference relations we study, this number
is never reached in practical solutions).
      </p>
      <p>
        Given the maximum number of resulting chains, (
        <xref ref-type="bibr" rid="ref20">20</xref>
        ) guesses which chain will
actually exist, and (
        <xref ref-type="bibr" rid="ref21">21</xref>
        ) ensures that if we use a certain chain ID, then we also use the
ID below (this eliminates symmetries by ensuring the usage of consecutive chain IDs).
Rules (
        <xref ref-type="bibr" rid="ref22">22</xref>
        ) and (
        <xref ref-type="bibr" rid="ref23">23</xref>
        ) represent all canonical mentions in cmention(·). For each of these
mentions, (
        <xref ref-type="bibr" rid="ref24">24</xref>
        ) guesses in which chain that mention will be in the result. Constraints (
        <xref ref-type="bibr" rid="ref25">25</xref>
        )
and (
        <xref ref-type="bibr" rid="ref26">26</xref>
        ) enforce, that the guessed solution in resultcm corresponds to those links of
annotators that have been selected to be part of the solution in clink .
      </p>
      <p>
        To incur a cost for transitive edges that are not present in annotations, we cannot use
clink → as in encoding MM. Instead, in weak constraint (
        <xref ref-type="bibr" rid="ref27">27</xref>
        ) we again join the resultcm
relation with itself on the chain ID.
4.3
      </p>
      <sec id="sec-4-1">
        <title>Weak and Strong Constraints for Preferences</title>
        <p>The encodings we have described so far impose a cost on chain/mention and
mention/mention annotations that are not reflected in the solution, and a cost on
mention/mention annotations in the solution that were not annotated by human annotators.</p>
        <p>If we replace the weak constraints by strict constraints (and remove the attached
cost) then we obtain solutions corresponding to infinite cost of a constraint violation,
i.e., the constraints must never be violated. Note that, if we replace all weak constraints
by strict constraints, inconsistent annotations will not admit a solution. To guarantee
existence of a solution, we must allow either to ignore chain/mention annotations, i.e.,
costm (M ) must be finite, or to ignore mention/mention annotations, i.e.,costl (M, M 0)
must be finite.</p>
        <p>Table 3 shows all objective function configurations we experiment with in this work.
A dash ‘-’ indicates that the constraint was used as a strict constraint, an integer
indicates a constant cost per ignored annotation, while L indicates a cost depending on the
length |M | of mention M , computed as follows.</p>
        <p>cost m(M ) =
 3 if |M | = 1
</p>
        <p>2 if 2 ≤ |M | ≤ 4
 1 otherwise
cost l(M, M 0) =
 3 if |M | + |M 0| = 2
</p>
        <p>2 if 3 ≤ |M | + |M 0| ≤ 8
 1 otherwise</p>
        <p>Some rationale for the choice of objectives in Table 3 is as follows. Annotators
make more mistakes with longer mentions, and for this reason it can be useful to assign
a higher cost to ignoring mentions that contain fewer tokens. Therefore cost functions</p>
        <p>Encoding/Approximation OOM
#</p>
        <p>OOT SAT
# #</p>
        <p>OPT
#</p>
        <p>T
sec</p>
        <p>M
MB
CM/25
CM/∞
MM/25
MM/∞
can be constant or depend on the length of the involved mentions. We can ensure
transitivity by enforcing that we never merge chains that would require additional
mentionmention links, however this would cause more mentions or their links to be ignored, and
we would like to believe that human annotators create annotations for a reason.
Therefore we put a cost on additional links that is lower than the cost of ignoring a given
annotation (in some cases the cost is even zero). Moreover ignoring a chain-mention
link implicitly removes at least one mention-mention link, and in large chains removes
many mention-mention links. Therefore we put higher cost on ignoreing chain-mention
links than on ignoring mention-mention links. A more comprehensive analysis of the
implications of cost functions in Table 3 for the practical usefulness of solutions is a
topic of future work. In this work our purpose is to provide a configurable framework
and study the influence on the cost function on feasibility of computation.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experimental Evaluation</title>
      <p>To evaluate our approach we used 21 documents from the METU-Sabanci Turkish
Treebank [29] where documents have between 494 and 2600 mentions (1199 on average)
and between 247 and 1300 chains (600 on average) with each chain containing between
19 and 53 mentions (34 on average). Each document was annotated by between 6 and
9 annotators (8 on average).</p>
      <p>Experiments were performed on a computer with 48 GB RAM and two Intel
E52630 CPUs (total 16 cores) using Debian 8 and at most 8 concurrently running jobs
(each job using a single core, as we did not use solvers in multithreaded mode). We
limited memory usage to 5 GB and time usage to 300 sec. For instantiation we used Gringo
4.5.0 [18], for solving Clasp 3.1.2 [19] and Wasp 2.0 [2]. We use Clasp with
parameters -opt-strategy=bb and -opt-strategy=usc and Wasp with parameters
-weakconstraints-algorithm=basic and without arguments (which
corresponds to unsat-core based optimization).</p>
      <p>As our instances are very hard for the abovementioned memory and time
constraints, we additionally experiment with an approximation for the preference relation:
we only use a subset of weak constraints, concretely we omit weak constraints (for
ignoring or adding mention-mention links) whenever the pair of mentions has a distance
of more than 25 tokens in the document.</p>
      <p>In our experimental results, columns OOM, OOT, SAT, and OPT, shows the number
of runs that exceeded 5 GB, exceeded 300 sec, found at least one satisfiable solution,</p>
      <p>Encoding/Solver OOM OOT SAT OPT T M Tgrd
# # # # sec MB sec</p>
      <p>Chc Conf Lem
# # #
CM/Clasp/B
CM/Wasp/B
CM/Clasp/U
CM/Wasp/U
MM/Clasp/B
MM/Wasp/B
MM/Clasp/U
MM/Wasp/U
and found the first optimal solution and proved its optimality, respectively. Moreover
columns T , M , and Tgrd give the average total time, memory, and instantiation time of
all runs. In our experiments, all OOM and OOT conditions were reached during solving,
never during grounding.</p>
      <p>For comparing the feasibility of computing solutions with MM and CM encodings
and with or without approximation, Table 1 depicts accumulated results over all 21
documents, 4 solver variations, and 12 objective functions (i.e., each table row
summarizes 1008 runs). We can observe that encoding MM exhausts memory much more than
encoding CM, and that we can only fit into 5 GB by using CM and the 25-token
approximation. Although CM fits into less memory, it requires more time on average and we
find more optimal solutions with encoding MM, both with and without the 25-token
approximation. Instantiation times are similar for all configurations, except for the CM/25
combination which requires half the time of the other configurations for reasons we try
to explain in Section 5.1.</p>
      <p>Table 2 compares computational methods for both encodings with 25-token
approximation across all 21 instances and 12 objective functions. We see that for getting
some answer set (SAT or OPT), bound-based optimization and CM encoding are better
than configurations with unsat-core optimization or encoding MM, i.e., CM/·/B
provides (suboptimal) solutions for all instances. However, the highest number of optimal
solutions is obtained with encoding MM and unsat-core optimization: here Clasp
performs better (183 optimal solutions) than Wasp (151) in this setting. The columns
Chc, Conf , and Lem, show the number of choices, conflicts, and learned clauses,
respectively (in runs that did not exhaust memory). Clasp did not find any satisfiable
solutions in unsat-core (U) mode (either the solutions were optimal or the timeout was
reached), while Wasp found many solutions before optimality. On the other hand, in
branch-and-bound mode (B) Clasp finds more optimal and more satisfiable solutions
than Wasp. The memory usage of Wasp is significantly higher than the one ofClasp
in mode B, while for CM/·/U this difference is smaller (248 MB vs 305 MB).</p>
      <p>In summary, for branch-and-bound optimization Clasp seems more suitable, while
for unsat-core optimization, Wasp seems more suitable.</p>
      <p>As only the CM encoding with 25-token approximation fits into memory for all
configurations, and because only branch-and-bound optimization finds solutions in all
cost m</p>
      <p>Objective
cost l cost a</p>
      <p>Clasp/B
SAT OPT T M
# # sec MB</p>
      <p>Wasp/B
SAT OPT T
# # sec
cases, we compared several possible objective functions using the CM/25 configuration.
Table 3 shows the result, both with Clasp and Wasp in branch-and-bound
optimization mode. We see that there are no big differences between the feasibility of various
objective functions: only few instances are solved to the optimal solution and therefore
average time requirements are very close to the timeout of 300 seconds. However, we
can observe interesting memory requirement, which is high for Wasp in those cases
where it is low for Clasp: for an objective function which does not permit ignoring
mention-mention links, uses fixed cost 2 on ignoring chain-mention annotations, and
either uses cost 1 or no cost for additional mention-mention links. Moreover, memory
requirement is more stable for Clasp while it varies more and is higher in general for
Wasp.</p>
      <sec id="sec-5-1">
        <title>5.1 Instantiation Analysis with Gringo</title>
        <p>To gain additional insights about the instantiated program, we have modified Gringo
4.5.0 to count instantiations of each nonground rule and extend the output of gringo
-verbose with a table that shows for each nonground rule the number of times it was
instantiated with 0 body conditions (i.e., as facts), with 1 and 2 body conditions (i.e.,
as rules that can be propagated particularly efficiently), and as rules with 3 or more
body conditions. Our publicly available patch3 shows these counts for the intermediate
representation of Gringo.</p>
        <p>Table 4 shows those parts of the instantiation analysis table where rules have been
instantiated more than 10k times for an instance of average difficulty (1298 mentions,
649 chains, max 35 mentions per chain).</p>
        <p>
          In the MM encoding, the main instantiation effort (725k rules) originates in the
transitive closure rules (
          <xref ref-type="bibr" rid="ref10">10</xref>
          ). On the other hand, in the CM encoding the main instantiation
3 https://github.com/peschue/clingo/tree/grounder-stats
        </p>
        <p>Variables in body</p>
        <p>Nonground Rule
0</p>
        <p>1
0 12K 0
0 0 725K
0 453 12K
0 12K 0
2
2
0
0 0 112K :-result_cm(C,X),clink(X,Y),not result_cm(C,Y).
0 0 112K :-result_cm(C,Y),clink(X,Y),not result_cm(C,X).
0 5212K 112K :~result_cm(C,Y),X&lt;Y,result_cm(C,X),</p>
        <p>
          not clink(X,Y).
effort (5212k rules) is the weak constraint that incurs a cost on mention-mention links
which are not present in any annotation (
          <xref ref-type="bibr" rid="ref27">27</xref>
          ). All other rules are instantiated less than
10k times, so the shown rules dominate over other rules in both cases.
        </p>
        <p>Two interesting things become apparent when relating the times in Table 2 and the
numbers in Table 4: although encoding CM contains significantly more instantiated
rules than MM (5M compared with 800K) (a) encoding CM can be instantiated much
faster than MM (4.1 sec compared with 11.1 sec), moreover (b) encoding CM requires
less memory and produces fewer OOM conditions than MM.</p>
        <p>A significant difference in the structure of CM and MM encodings is, that CM is
tight [15], i.e., it requires no unfounded-loop check and can be solved by solving a SAT
instance, while encoding MM is not tight due to the transitive closure rule.</p>
        <p>As a result of these observations, it appears as if loop nogoods produced by solvers
in encoding MM cause the observed OOM conditions and the increased memory
footprint. Still, encoding MM permits to find more optimal solutions, which indicates that
— if there is sufficient memory available — the non-tight encoding MM is to be
preferred over the tight encoding CM.
6</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>We have developed a method for automatic merging of coreference resolution
annotations, with the objective of using as much information from annotators as possible while
maintaining consistency of the resulting annotation.</p>
      <p>In practice, adjudication cannot be done automatically, therefore our approach
allows to enforce a part of the resulting mentions and chains, while optimizing a solution
over the remaining tokens that were not (yet) specified by the user.</p>
      <p>We have created a tool based on our encodings and based on the popular CoNLL
format for coreference resolution and the format used by the CorScorer [27] tool. We
are currently using this method to prepare a corpus for Turkish coreference annotations
and plan to release the tool as open source software in the future.
6.1</p>
      <sec id="sec-6-1">
        <title>Related and Future Work</title>
        <p>ASP encodings for transitivity are present in many applications. In particular ASP
encodings for acyclicity properties have been studied by Gebser et al. [16] including
transitive closure as part of some encodings. Similar as in our experiments, tightness makes
a relevant difference. Different from Gebser et al. we consider optimization problems
and compare different ASP solvers and different optimization algorithsm.</p>
        <p>Finding minimal repairs for inconsistent annotations is related to finding minimal
repairs for databases [8] or ontologies [13], and to inconsistency management in
distributed knowledge based systems [12]. In these related applications, the aim is to find
a minimal change in the system that make it globally consistent, as in the merging of
coreference annotations. All these applications have in common, that a change which
fixes one inconsistency might introduce another one, as in the case of coreference
annotations, where adding a mention-mention link can implicitly create many other invalid
mention-mention links, and removing a mention-mention link can require us to remove
many mention-mention links that have been annotated.</p>
        <p>Maximizing the usage of annotations provided by annotators based on
mentionmention links is similar to the MUC [32], B3 [4], and BLANC [28] evaluation metrics
for coreference analysis which are based on mention-mention links. In particular the
most recently proposed BLANC metric puts an emphasis on including non-existing
mention-mention links between gold standard and system output into the score, which
is related to our constraints on mention-mention links that are not present in
annotations. More remotely related to our work is the CEAF [24] metric which is based on
finding an optimal bipartite graph matching between system output and gold
annotations, and identifying percentage of either correct chains or correct mentions. Realizing
an optimization criterion based on this idea in our work would be an interesting topic
for future work.</p>
        <p>
          Denis and Baldridge described an approach for coreference resolution and named
entity classification based on Integer Linear Programming [10], which is a formalism
related to Answer Set Programming. Their approach is not aimed at merging
annotations and does not support a semi-automatic operation, however they include in their
formulation a similar constraint on transitivity of mention-mention links as we have in
rules (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) and (
          <xref ref-type="bibr" rid="ref10">10</xref>
          ).
        </p>
        <p>Our current encodings are not yet feasible for large inputs which occur in practice,
as shown in our experiments. In many cases the optimal solution is not found and only
with approximation of the objective function we can find some solution for all instances
within 5 minutes. In future work we will study the impact of optimality on the usability
of a solution, and we will investigate the choice of preference function for making
our method useful for adjudication in practice. To increase performance, we plan to
apply recently developed solver versions that support stratification [3] and
unsat-coreshrinking [1] for providing suboptimal answer sets while using unsat-core methods,
moreover we plan to apply methods for lazy instantiation of constraints [11, 30].</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alviano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dodaro</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Anytime answer set optimization via unsatisfiable core shrinking</article-title>
          .
          <source>Theory and Practice of Logic</source>
          Programming pp.
          <source>ICLP</source>
          <year>2016</year>
          , In press, arXiv:
          <fpage>1608</fpage>
          .00731 [cs.LO] (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Alviano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dodaro</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leone</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricca</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Advances in WASP</article-title>
          .
          <source>In: International Conference on Logic Programming and Non-monotonic Reasoning (LPNMR)</source>
          . pp.
          <fpage>40</fpage>
          -
          <lpage>54</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ansótegui</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonet</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levy</surname>
          </string-name>
          , J.:
          <article-title>SAT-based MaxSAT algorithms</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>196</volume>
          ,
          <fpage>77</fpage>
          -
          <lpage>105</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bagga</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldwin</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Algorithms for Scoring Coreference Chains Shortcomings of the MUC-6 Algorithm</article-title>
          . In: International conference on
          <source>language resources and evaluation workshop on linguistics coreference</source>
          . pp.
          <fpage>563</fpage>
          -
          <lpage>566</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Baral</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Knowledge Representation, Reasoning, and Declarative Problem Solving</article-title>
          . Cambridge University Press (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Brewka</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eiter</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Truszczynski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Answer set programming at a glance</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>54</volume>
          (
          <issue>12</issue>
          ) (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Calimeri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gebser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ianni</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminski</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krennwallner</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leone</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricca</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>ASP-Core-2 Input language format</article-title>
          .
          <source>Tech. rep.</source>
          , ASP Standardization Working Group (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Chomicki</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marcinkowski</surname>
          </string-name>
          , J.:
          <article-title>Minimal-change integrity maintenance using tuple deletions</article-title>
          .
          <source>Information and Computation</source>
          <volume>197</volume>
          (
          <issue>1</issue>
          ),
          <fpage>90</fpage>
          -
          <lpage>121</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>González-Brenes</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          : Coreference Resolution: Current Trends and
          <string-name>
            <given-names>Future</given-names>
            <surname>Directions</surname>
          </string-name>
          . Language and
          <string-name>
            <surname>Statistics II Literature Review</surname>
          </string-name>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Denis</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldridge</surname>
          </string-name>
          , J.:
          <article-title>Global joint models for coreference resolution and named entity classification</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>42</volume>
          ,
          <fpage>87</fpage>
          -
          <lpage>96</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Dodaro</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricca</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schüller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>External Propagators in WASP: Preliminary report</article-title>
          . In: International Workshop on
          <article-title>Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA) (</article-title>
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Eiter</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fink</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schüller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weinzierl</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Finding Explanations of Inconsistency in Multi-Context Systems</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>216</volume>
          ,
          <fpage>233</fpage>
          -
          <lpage>274</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Eiter</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fink</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepanova</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Towards Practical Deletion Repair of Inconsistent DLprograms</article-title>
          .
          <source>In: European Conference on Artificial Intelligence</source>
          . pp.
          <fpage>285</fpage>
          -
          <lpage>290</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Elango</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Coreference Resolution:
          <string-name>
            <given-names>A</given-names>
            <surname>Survey</surname>
          </string-name>
          .
          <source>Tech. rep.</source>
          , University of Wisconsin, Madison (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Erdem</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lifschitz</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Tight logic programs</article-title>
          .
          <source>Theory and Practice of Logic Programming</source>
          <volume>3</volume>
          (
          <issue>4-5</issue>
          ),
          <fpage>499</fpage>
          -
          <lpage>518</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Gebser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janhunen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rintanen</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>ASP Encodings of Acyclicity Properties</article-title>
          .
          <source>Journal of Logic and Computation</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Gebser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminski</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , Kaufmann,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Schaub</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>Answer Set Solving in Practice</article-title>
          . Morgan Claypool (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Gebser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminski</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>König</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Advances in gringo series 3</article-title>
          . In: International Conference on
          <article-title>Logic Programming and Non-monotonic Reasoning (LPNMR)</article-title>
          . pp.
          <fpage>345</fpage>
          -
          <lpage>351</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Gebser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaufmann</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Conflict-driven answer set solving: From theory to practice</article-title>
          .
          <source>Artificial Intelligence 187-188</source>
          ,
          <fpage>52</fpage>
          -
          <lpage>89</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Hirst</surname>
          </string-name>
          , G.:
          <article-title>Anaphora in Natural Language Understanding: A Survey</article-title>
          . Springer-Verlag (
          <year>1981</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Kehler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kertz</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rohde</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elman</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          :
          <article-title>Coherence and Coreference Revisited</article-title>
          .
          <source>Journal of Semantics</source>
          <volume>25</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>44</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peirsman</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chambers</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Surdeanu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dan</surname>
            <given-names>Jurafsky</given-names>
          </string-name>
          :
          <article-title>Deterministic Coreference Resolution Based on Entity-Centric, Precision-Ranked Rules</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>39</volume>
          (
          <issue>4</issue>
          ),
          <fpage>885</fpage>
          -
          <lpage>916</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Lifschitz</surname>
          </string-name>
          , V.:
          <article-title>What Is Answer Set Programming?</article-title>
          <source>In: AAAI Conference on Artificial Intelligence</source>
          . pp.
          <fpage>1594</fpage>
          -
          <lpage>1597</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>On coreference resolution performance metrics</article-title>
          .
          <source>In: Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT/EMNLP)</source>
          . pp.
          <fpage>25</fpage>
          -
          <lpage>32</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>McCarthy</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Elaboration tolerance</article-title>
          .
          <source>In: Common Sense</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Mitkov</surname>
          </string-name>
          , R.:
          <article-title>Anaphora Resolution: the State of the Art</article-title>
          .
          <source>School of Languages and European Studies</source>
          , University of Wolverhampton (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Pradhan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Recasens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovy</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strube</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Scoring Coreference Partitions of Predicted Mentions: A Reference Implementation</article-title>
          . In:
          <article-title>Annual Meeting of the Association for Computational Linguistics (ACL)</article-title>
          .
          <source>vol. 2</source>
          , pp.
          <fpage>30</fpage>
          -
          <lpage>35</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Recasens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovy</surname>
          </string-name>
          , E.:
          <article-title>BLANC: Implementing the Rand Index for Coreference Evaluation</article-title>
          .
          <source>Natural Language Engineering</source>
          <volume>17</volume>
          (
          <issue>4</issue>
          ),
          <fpage>485</fpage>
          -
          <lpage>510</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Say</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeyrek</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oflazer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Özge</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Development of a Corpus and a Treebank for Present Day Written Turkish</article-title>
          .
          <source>In: (Proceedings of the Eleventh International Conference of Turkish Linguistics</source>
          ,
          <year>2002</year>
          ) Current Research in Turkish Linguistics, pp.
          <fpage>182</fpage>
          -
          <lpage>192</lpage>
          . Eastern Mediterranean University Press (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Schüller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Modeling Variations of First-Order Horn Abduction in Answer Set Programming</article-title>
          .
          <source>Fundamenta Informaticae</source>
          (
          <year>2016</year>
          ), to appear. arXiv:
          <volume>1512</volume>
          .08899 [cs.AI]
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eisner</surname>
          </string-name>
          , J.:
          <article-title>Easy-first Coreference Resolution</article-title>
          .
          <source>In: International Conference on Computational Linguistics (COLING)</source>
          . pp.
          <fpage>2519</fpage>
          -
          <lpage>2534</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Vilain</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burger</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aberdeen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Connolly</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hirschman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A Model-Theoretic Coreference Scoring Scheme</article-title>
          .
          <source>In: Message Understanding Conference (MUC-6)</source>
          . pp.
          <fpage>45</fpage>
          -
          <lpage>52</lpage>
          . ACL Anthology (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chapman</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crowley</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savova</surname>
          </string-name>
          , G.:
          <article-title>Coreference resolution: A review of general methodologies and applications in the clinical domain</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>44</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1113</fpage>
          -
          <lpage>1122</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>