=Paper= {{Paper |id=Vol-2006/paper052 |storemode=property |title=Contrast-Ita Bank: A corpus for Italian Annotated with Discourse Contrast Relations |pdfUrl=https://ceur-ws.org/Vol-2006/paper052.pdf |volume=Vol-2006 |authors=Anna Feltracco,Bernardo Magnini,Elisabetta Jezek |dblpUrl=https://dblp.org/rec/conf/clic-it/FeltraccoMJ17 }} ==Contrast-Ita Bank: A corpus for Italian Annotated with Discourse Contrast Relations== https://ceur-ws.org/Vol-2006/paper052.pdf
Contrast-Ita Bank: A corpus for Italian Annotated with Discourse Contrast Relations

Anna Feltracco
Fondazione Bruno Kessler; University of Pavia, Italy; University of Bergamo, Italy
feltracco@fbk.eu

Bernardo Magnini
Fondazione Bruno Kessler, Trento, Italy
magnini@fbk.eu

Elisabetta Jezek
University of Pavia, Pavia, Italy
jezek@unipv.it

Abstract

English. We present Contrast-Ita Bank, a corpus annotated with discourse contrast relations in Italian. We annotate both explicit and implicit contrast relations, following the schema proposed in the Penn Discourse Treebank. We provide and discuss quantitative data about the new resource.

Italiano. Presentiamo Contrast-Ita Bank, un corpus annotato con relazioni di contrasto in italiano. Abbiamo annotato sia relazioni esplicite che implicite, adottando lo schema proposto nel Penn Discourse Treebank. Portiamo e discutiamo dati quantitativi sulla nuova risorsa.

1    Introduction

A relevant task in Natural Language Processing is the automatic identification of semantic relations between portions of text, such as textual entailment, text similarity, and temporal relations. In this contribution we focus on discourse contrast.

By discourse relation we mean a relation between two parts of a coherent sequence of sentences, propositions or speeches (i.e. discourse). We consider as discourse contrast: i) cases in which one of the two parts (henceforth arguments) is similar to the other in many aspects but different in one aspect for which they are compared, as in example (1), where both situations refer to a change in the price, but with different values; ii) cases in which one argument denies an expectation triggered by the other argument, as in (2), where ‘going to the beach’ denies the expectation that, since it is raining, one would stay home. Contrast in text can be conveyed explicitly, by means of a lexical element (connective), as by while in (1) and although in (2), or implicitly as in (3).

   (1) The price of x increased by 5%, while the price of y decreased by 2.3%.
   (2) Although it was raining, we went to the beach.
   (3) Mary passed the exam. John failed it.

We present Contrast-Ita Bank,1 a corpus of Italian documents annotated with contrast, a very frequent relation in discourse. We aim to understand how frequent the contrast relation is in discourse, when it is expressed explicitly and implicitly, and which connectives convey contrast. The final result of the annotation represents a first step toward a corpus of discourse relations for Italian, compatible with the Penn Discourse Treebank (PDTB) project (Prasad et al., 2007), the largest and most used corpus annotated with discourse relations in the NLP field. A number of annotated corpora similar to the PDTB have been realised since its creation, for instance the Prague Discourse TreeBank (Bejček et al., 2013), the Chinese Discourse TreeBank (Zhou and Xue, 2015) and the Leeds Arabic Discourse TreeBank (Al-Saif and Markert, 2010).2 For Italian, a similar attempt was proposed by Tonelli et al. (2010), which uses the PDTB scheme for the annotation of the LUNA conversational spoken dialogue corpus. The authors annotated 60 real dialogues in the domain of software/hardware troubleshooting. Another project for Italian inspired by the PDTB is proposed by Pareti and Prodanof (2010) and is focused on the relation of attribution, i.e. “the relation of ownership between abstract objects and individuals or agents” (Prasad et al., 2007, p. 40).

    1 https://hlt-nlp.fbk.eu/technologies/contrast-ita-bank
    2 Prasad et al. (2014) propose an overview of projects, also mentioning resources for French, Turkish and Hindi.

Resources manually annotated with discourse relations have been used, for instance, for developing methods and tools for the automatic identification and disambiguation of explicitly marked or implicitly conveyed discourse relations,3 for the identification of the spans of text that are linked by relations (discourse segmentation), for the automatic creation of a summary of a written text (text summarization) (Marcu, 1998), and for machine translation (Meyer and Webber, 2013).

    3 The task of identifying discourse relations in the form of a discourse connective taking two arguments is also called shallow discourse parsing and constituted a shared task of the CoNLL conference in 2015 and 2016 (Xue et al., 2015).

The paper is structured as follows: Section 2 introduces the contrast relation; Section 3 describes the annotation guidelines; Section 4 presents the content of the resource and Section 5 discusses the inter-annotator agreement.

2    The Contrast Relation

Discourse contrast has been described in various theories and annotation schemas. In the Rhetorical Structure Theory (RST) (Mann and Thompson, 1988), contrast is defined as the relation between two spans of text such that the situations presented in the two spans are: “(i) comprehended as the same in many respects, (ii) comprehended as differing in a few respects, and (iii) compared with respect to one or more of these differences” (Mann and Thompson, 1988). In the framework of RST, Carlson and Marcu (2001) propose a discourse relations corpus; in their schema, contrast is part of a broader class of relations called Contrast, together with concession, described as “characterised by a violated expectation” (Carlson and Marcu, 2001).

In the Segmented Discourse Representation Theory framework, Asher and Lascarides (1993; 2003) define contrast as a relation that involves constituents that are structurally similar but semantically dissimilar. According to them, this relation includes cases of violation of expectation in which what can be inferred from one of the constituents of a relation is denied in the second constituent (Asher and Lascarides, 2003, p. 167).

The Penn Discourse Treebank schema (Prasad et al., 2007) proposes different senses of the connectives that provide a semantic description of the discourse relation they convey. These senses are annotated as sense tags. The sense tag CONTRAST applies to cases in which the two arguments of a relation “share a predicate or a property and a difference is highlighted with respect to the values assigned to the shared property”; the sense tag CONCESSION is used for cases in which “the highlighted differences are related to expectations raised by one argument which are then denied by the other” (Prasad et al., 2007).4

    4 In the PDTB3.0 hierarchy (Webber et al., 2016), the two sense types belong to the class COMPARISON.

We consider as contrast both what has been called formal contrast (Asher, 1993) and CONTRAST (Prasad et al., 2007) on the one hand (see Examples (1) and (3)), and violation of expectation (Asher, 1993) or CONCESSION (Carlson and Marcu, 2001; Prasad et al., 2007) on the other hand (as in Example (2)).

3    Adopting the PDTB Schema

The Contrast-Ita Bank guidelines follow the PDTB 2.0 Annotation Manual (Prasad et al., 2007) and the recent proposal by Webber et al. (2016).

Following the PDTB 2.0, we annotate explicit relations (see Examples (1) and (2) above) by identifying the discourse connectives that trigger the relations and the respective arguments. We also annotate cases in which the relation is not marked by a connective and can be inferred between adjacent sentences. These cases include implicit relations, i.e. the relation is not lexically marked, as in Example (3), and alternatively lexicalized (altlex) relations, i.e. the relation is inferred by means of another expression that is not a connective. By definition, these are cases where a discourse relation is inferred between adjacent sentences in the absence of a connective, but where providing a suggestion of connective leads to redundancy in the expression of the relation (Prasad et al., 2007). For instance, in ‘She prepared a cake. The reason: it was his birthday.’,5 a cause relation is conveyed through ‘The reason:’; this relation is a case of altlex, since ‘The reason:’ is not a connective, and providing a suggestion of connective (e.g. because) would lead to redundancy. Unlike the PDTB 2.0, we annotate implicit relations also among comma-separated clauses and altlex relations among non-adjacent sentences.

    5 See a similar example in (Prasad et al., 2007, p. 7).

Specifically, our task involves: i) the annotation of the arguments of the relation (named Arg1 and Arg2, where Arg2 is the argument in the clause that is syntactically bound to the connective, and Arg1 is the other one); ii) the annotation of the connectives that convey contrast in the case of explicit relations, of the first token of Arg2 in the case of implicit relations, and of the expression that lets us infer the relation in the case of altlex relations; iii) the tagging of the sense of the relation. An example from the PDTB2.0 Manual (Prasad et al., 2007) is provided in (4), in which the connective appears underlined, Arg1 is in italics, and Arg2 is in bold.

   (4) Most bond prices fell on concerns about this week’s new supply and disappointment that stock prices didn’t stage a sharp decline. Junk bond prices moved higher, however. (sense tag: Contrast)

Connectives. We followed the PDTB also for the definition of connectives that convey an explicit relation. They belong to three syntactic classes: (i) subordinating conjunctions (e.g. when, because); (ii) coordinating conjunctions (e.g. and, or, but); (iii) discourse adverbials, including both adverbs (e.g. however, instead), and prepositional phrases (e.g. on the other hand, as a result).

Arguments. According to the PDTB, relations are annotated when they are connecting “two abstract objects such as events, states, and propositions (Asher, 1993)” (Prasad et al., 2007), that are realised mostly as clauses, nominalisations, or anaphoric expressions. We follow the same guidelines, including conjoined VPs, as proposed by Webber et al. (2016).6 We also adopt the Minimality Principle, according to which “only as many clauses and/or sentences should be included in an argument selection as are minimally required and sufficient for the interpretation of the relation” (Prasad et al., 2007). This means that there is no constraint on the length of an argument and that more than one sentence can be annotated (i.e. punctuation is generally not a limiting constraint).

    6 This change includes avoiding the annotation of the span of text that can be referred to both arguments in case of inter-sentential VP conjoined arguments (e.g. in ‘Mary likes fruits but hates peaches’, ‘Mary’ has not been annotated).

Senses of relations. We consider a broad semantic definition of contrast, corresponding to the PDTB sense tags CONTRAST and CONCESSION. Specifically, we follow the PDTB 3.0 schema (Webber et al., 2016) in which CONCESSION has two subtypes, depending on which argument creates the expectation and which one denies it: if Arg2 creates an expectation that Arg1 denies, the proper tag is CONCESSION Arg1.as.denier; conversely, when Arg1 creates an expectation that Arg2 denies, the tag that needs to be used is CONCESSION Arg2.as.denier. In line with the PDTB2.0 we allow the annotation of more than one sense for a connective and, thus, the possibility of marking e.g. both CONTRAST and CONCESSION Arg1.as.denier. Table 1 summarises the definition of the tags.

    Relation and Definition in the PDTB
    CONTRAST → the two Args share a predicate or a property and the difference between the two situations (in the Args) is highlighted with respect to the values assigned to the property.
    CONCESSION → expectations raised by one argument which are then denied by the other.
     - Arg1.as.denier if Arg1 denies expectation
     - Arg2.as.denier if Arg2 denies expectation

Table 1: CONTRAST and CONCESSION in the PDTB 3.0 (Webber et al., 2016).

4    Contrast-Ita Bank

Contrast-Ita Bank is based on a corpus of 169 news stories selected from Ita-TimeBank (Caselli et al., 2011), for a total of 65,053 tokens (average length = about 385 tokens per document).7 For the annotation we used the CAT tool (Bartalesi Lenzi et al., 2012). The annotation was carried out by one expert annotator in about two weeks.

    7 The same corpus is annotated with factuality information in Fact-Ita Bank (Minard et al., 2014) and partially annotated with negation in Fact-Ita Bank-Negation (Altuna et al., 2017).

We annotated explicit, implicit and altlex relations of contrast for a total of 372 relations (average 2.16 per document). Table 2 reports the data of the annotation. Explicit relations are the most common and correspond to 91% of all the relations. We register a maximum number of 15 explicit relations in one document and an average of 2 relations per document. Implicit relations are less frequent and occur 15 times inter-sententially and 9 times intra-sententially, for a total of 24 annotations. This is different from the PDTB2.0, in which the ratio between explicit and implicit relations for CONTRAST and COMPARISON, and their subtypes, is about 0.45, while in Contrast-Ita Bank it is ten times lower. This might be due to the fact that in Contrast-Ita Bank annotators were asked to mark contrast, and it is possible that they simply failed to capture implicit relations, while in the PDTB2.0 annotators were asked to mark also cases where no relation can be inferred between adjacent sentences, thus analysing in detail whether a relation appears between every pair of sentences. Altlex relations are rarer: in Contrast-Ita
Bank there are 7 cases.8 In these cases relations are alternatively lexicalized by: ‘anche al netto di’, ‘Certo’, ‘Il punto è che’, ‘Non’, ‘Peccato che’, ‘quella sì’, ‘Macchè’; none of these expressions is a connective.

    8 This is also the rarest type in the PDTB 2.0, among the three considered here.

                         Explicit   Implicit   AltLex    Total
    CONTRAST                   87         12        3      102
    CONC.Arg1-denier           21          0        1       22
    CONC.Arg2-denier          201          8        3      212
    Double relations           32          4        0       36
    Total                     341         24        7      372
    Density                0.0052     0.0003   0.0001   0.0056

Table 2: Contrast relations in Contrast-Ita Bank.

Table 2 also shows that the per token density of contrast in the corpus is 0.0056, similar to the PDTB (i.e. 0.0072).9

    9 It is possible that contrast is more frequent in corpora of other domains, such as in documents reporting debates in which people contrast their opinions. However, with the idea of maximising the compatibility with the PDTB, we annotated contrast on a corpus of news.

The most frequent sense tag is CONCESSION.Arg2-as-denier (i.e. when Arg2 denies an expectation that arises from Arg1), which covers about 56% of the cases. CONTRAST covers almost a quarter of the cases and the two relations have been annotated together 32 times (out of the total 36 cases of double annotation). CONCESSION.Arg1-as-denier is far less frequent, both as a single type and together with other relations, and has been annotated in less than 10% of the cases. This subtype is associated with a limited set of connectives: of the 19 connectives in Contrast-Ita Bank (see Table 3), 7 (e.g. nonostante) always signal CONCESSION.Arg1-as-denier.

Not surprisingly, ma accounts for almost half of the cases (its English equivalent but is also the most used connective for these senses in the PDTB 2.0), and invece, mentre and però for about 10% each. Table 3 shows that, as happens for content words, the most frequent connectives are the most polysemous ones.

    connective                    #       %   % CONTRAST   % CONC.Arg1-denier   % CONC.Arg2-denier   % Double Relation
    ma                          164   48.09          4.3                    -                 87.2                 8.5
    invece                       41   12.02           78                    -                 9.75               12.25
    mentre                       36   10.56         88.9                    -                  2.8                 8.3
    però                         35   10.26          2.9                    -                 85.7                11.4
    nonostante                   11    3.23            -                  100                    -                   -
    anche se                     10    2.93            -                    -                   90                  10
    e                             8    2.35           75                    -                    -                  25
    se                            8    2.35           75                    -                    -                  25
    eppure                        7    2.05            -                  100                    -                   -
    comunque                      4    1.17            -                  100                    -                   -
    pur                           4    1.17            -                  100                    -                   -
    tuttavia                      4    1.17            -                    -                  100                   -
    a dispetto di                 2    0.59            -                  100                    -                   -
    seppure                       2    0.59            -                  100                    -                   -
    al contrario                  1    0.29          100                    -                    -                   -
    al contrario di               1    0.29          100                    -                    -                   -
    da una parte.. dall’altra     1    0.29          100                    -                    -                   -
    in verità                     1    0.29            -                    -                    -                 100
    in realtà                     1    0.29            -                  100                    -                   -

Table 3: Contrast connectives in Contrast-Ita Bank along with: total number, percentage over the total cases, percentage of cases per sense tags.

5     Inter Annotator Agreement

We computed the agreement (IAA) between two annotators, who followed the same written guidelines, on 18 documents (10.6% of the whole corpus). Data are reported in Table 4.

First we measured the agreement on recognising explicit, implicit or altlex contrast relations (relation identification), considering the text span marked by the annotators to signal a relation (e.g. agreement if both marked ma or if one marked se and the other anche se to signal the presence of a contrast relation). We calculated the final score using Dice’s coefficient (Rijsbergen, 1979).10 The result is that annotators agree in 37 cases (Dice 0.68). We consider this result reasonable given the difficulty of the task, which should not be underestimated. Identifying contrast relations in a document means distinguishing cases in which a lexical element plays the role of a contrast connective from cases in which it does not, and also identifying implicit relations, which by definition are not marked in the text. In order to understand the reasons for these discrepancies, we adopted a reconciliation strategy in which the annotators were asked to motivate their choices, with the possibility of revising them. After the reconciliation discussion 16 cases were reconciled and the Dice value increased to 0.84.

    10 Dice’s coefficient measures how similar two sets are by dividing twice the number of elements they share by the total number of elements in the two sets. This produces a value from 1, if both sets share all elements, to 0, if they have no element in common.

    # of relations by annotators: A = 57; B = 51; A ∩ B = 37
    IAA on:
    relation identification                              0.68
    relation identification - post reconciliation        0.84
    connectives identification - explicit                0.68
    arguments span - exact match (Arg1; Arg2)            0.51; 0.70
    arguments span - relaxed match (Arg1; Arg2)          0.89; 0.91
    sense type: CONTRAST - CONCESSION                    0.73
    sense subtype: Arg1.as.denier - Arg2.as.denier       0.9

Table 4: Inter Annotator Agreement.

In other cases disagreement remained. These mainly include cases in which both annotators recognized a discourse relation but one interpreted the relation to be of contrast, while the other did not. In many cases, these relations are conveyed by the coordinating conjunction ‘e’. We report an example in which one annotator recognized a contrast, while the other considered the arguments as non-contrasting parts of a description.
   (5) [..] sono portatori sani di Talassemia Mayor e il loro bambino, Luca, cinque anni, è talassemico.11 [doc:5402]
         CONTRAST vs NON-MARKED

0.73, showing that recognising the type of contrast can be a controversial decision among annotators. However, we believe that this result is fair, considering that the annotation regards non mutually exclusive types of the same class.
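
As a rough illustration of how two of the agreement figures in this section can be computed, the sketch below implements Dice's coefficient and the point-based sense-type score (1 point for identical tags, 0.5 when one annotator assigns one tag and the other assigns both, 0 otherwise). It is not the authors' code: the function names and the toy sense annotations are assumptions made for the example, and only the counts A = 57, B = 51 and A ∩ B = 37 come from Table 4.

```python
"""Sketch of two agreement measures; all names here are illustrative assumptions."""


def dice(size_a: int, size_b: int, shared: int) -> float:
    """Dice's coefficient from set sizes: 2 * |A ∩ B| / (|A| + |B|)."""
    total = size_a + size_b
    if total == 0:
        # Assumption: two empty annotation sets are scored as no agreement.
        return 0.0
    return 2 * shared / total


def sense_points(tags_a, tags_b) -> float:
    """Score one relation: 1 for identical tag sets, 0.5 for partial overlap, 0 otherwise."""
    a, b = frozenset(tags_a), frozenset(tags_b)
    if a == b:
        return 1.0
    if a & b:
        # With two possible tags this is the "one tag vs both tags" case.
        return 0.5
    return 0.0


def sense_agreement(pairs) -> float:
    """Sum the points over all relations and divide by the number of relations."""
    points = [sense_points(a, b) for a, b in pairs]
    if not points:
        raise ValueError("at least one annotated relation is required")
    return sum(points) / len(points)


# Relation identification with the counts reported in Table 4.
relation_dice = dice(57, 51, 37)  # 74 / 108, roughly 0.685

# Hypothetical sense annotations for four relations (not taken from the corpus).
toy_pairs = [
    ({"CONTRAST"}, {"CONTRAST"}),
    ({"CONTRAST"}, {"CONTRAST", "CONCESSION"}),
    ({"CONCESSION"}, {"CONTRAST"}),
    ({"CONCESSION"}, {"CONCESSION"}),
]
toy_score = sense_agreement(toy_pairs)  # (1 + 0.5 + 0 + 1) / 4

print(f"Dice on relation identification: {relation_dice:.3f}")
print(f"Toy sense-type agreement: {toy_score:.3f}")
```

With the Table 4 counts the coefficient is 74/108, which falls between 0.68 and 0.69 and is consistent with the reported 0.68. The post-reconciliation value is not recomputed here because the revised set sizes are not given.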
    Agreement on connectives identification is cal-                Finally, when there is agreement on CON-
culated considering if both annotators agree on                 CESSION, we applied the same formula to cal-
recognising the same explicit relation and the                  culate IAA between CONCESSION subtypes:
same exact span of text to be a connective (thus                Arg1.as.denier - Arg2.as.denier: agreement is 0.9.
excluding cases of altlex and implicit). In these               Specifically, annotators agree in 10 cases to mark
terms, cases of agreement for connectives identifi-             CONCESSION but in one case they disagree over
cation are a subset of cases of agreement already               the direction of the relation.12
captured by the relation identification. The result-               Overall, the IAA highlights that the main dif-
ing agreement is 0.68 (Dice’s coefficient).                     ficulties of annotating contrast concern: the rela-
    For the 37 cases of agreement on relation iden-             tion identification, especially for implicit and al-
tification, we calculated the IAA on the span of                tlex relations; the extent of the arguments: the
arguments in two ways. In the exact match mode,                 two annotators frequently do not mark exactly the
we have agreement if the two annotators mark the                same tokens but it is very likely that their anno-
same exact span of text as Arg1 or Arg2 for the same            tations overlap by at least 50%; sense type:
relation; in the relaxed match mode, we count an                one annotator tends to annotate also the CON-
agreement if the text spans identified by the anno-             CESSION Arg2.as.denier when marking CON-
tators overlap by at least 50%. Agreement in                    TRAST, while the other annotator does not.
the exact match for Arg1 is 0.51 and for Arg2 is
0.70; in the relaxed match mode is 0.89 for Arg1                6        Conclusion and Further Work
and 0.91 for Arg2. We expected the exact match
agreement difficult to reach. In fact, as described             We presented Contrast-Ita Bank, a corpus anno-
in Section 3, we adopt the Minimality Principle for             tated with discourse contrast relations in Italian.
the annotation of the arguments. The selection of               Following the PDTB annotation schema, we an-
the arguments span thus relies significantly on the             notated explicit, implicit and altelex relations of
interpretation of the annotators and cases in which             contrast. We also present the list of connectives
there is no exact match can be frequent.                        that convey contrast in the corpus. The new re-
    Agreement in identifying CONTRAST and                       source can be integrated with LICO, the Lexicon
CONCESSION (sense type) is calculated count-                    of Italian Connectives (Feltracco et al., 2016), val-
ing 1 point if annotators agree to assign (or not)              idating the list of connectives and adding examples
the same tag(s), 0.5 if one chooses a tag and the               from corpus to the connectives. Contrast-Ita Bank
other both, 0 for total disagreement. IAA is ob-                    12
                                                                     For the argument identification in the PDTB 2.0, Prasad
tained summing the points for each annotation and               et al. (2008) report an agreement of 90.2% for explicit re-
dividing by the total of 37 relations that both an-             lation and 85.1% for implicit (we do not calculate the value
notators identified. Agreement for sense type is                considering this granularity); when relaxing the match to par-
                                                                tial overlap, the two values increase to 94.5% and to 85.1%.
  11
     Eng.:[..] they are carrier of Talassemia Mayor and their   Additionally, authors report an agreement of 94% for sense
son, Luca, five years old, is thalassaemic.                     class, of 84% for sense type, and of 80% for the subtype level.
is distributed under a CC-BY-NC 4.0 licence.

References

Amal Al-Saif and Katja Markert. 2010. The Leeds Arabic Discourse Treebank: Annotating Discourse Connectives for Arabic. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10).

Begoña Altuna, Manuela Speranza, and Anne-Lyse Minard. 2017. The Scope and Focus of Negation: A Complete Annotation Framework for Italian. Semantics Beyond Events and Roles (SemBEaR) 2017, page 34.

Nicholas Asher and Alex Lascarides. 2003. Logics of conversation. Cambridge University Press.

Nicholas Asher. 1993. Reference to Abstract Objects in Discourse. Kluwer Academic Publishers, Dordrecht.

Valentina Bartalesi Lenzi, Giovanni Moretti, and Rachele Sprugnoli. 2012. CAT: the CELCT Annotation Tool. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 333–338.

Eduard Bejček, Eva Hajičová, Jan Hajič, Pavlína Jínová, Václava Kettnerová, Veronika Kolářová, Marie Mikulová, Jiří Mírovský, Anna Nedoluzhko, Jarmila Panevová, Lucie Poláková, Magda Ševčíková, Jan Štěpánek, and Šárka Zikánová. 2013. Prague Dependency Treebank 3.0. http://ufal.mff.cuni.cz/pdt3.0/.

Lynn Carlson and Daniel Marcu. 2001. Discourse tagging reference manual. ISI Technical Report ISI-TR-545, 54:56.

Tommaso Caselli, Valentina Bartalesi Lenzi, Rachele Sprugnoli, Emanuele Pianta, and Irina Prodanof. 2011. Annotating events, temporal expressions and relations in Italian: the It-TimeML experience for the Ita-TimeBank. In Proceedings of the 5th Linguistic Annotation Workshop, pages 143–151.

Anna Feltracco, Elisabetta Jezek, Bernardo Magnini, and Manfred Stede. 2016. LICO: A Lexicon of Italian Connectives. Proceedings of the Second Italian Conference on Computational Linguistics (CLiC-it 2016), page 141.

William C Mann and Sandra A Thompson. 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text-Interdisciplinary Journal for the Study of Discourse, 8(3):243–281.

Daniel Marcu. 1998. The rhetorical parsing, summarization, and generation of natural language texts. Ph.D. thesis, University of Toronto.

Thomas Meyer and Bonnie Webber. 2013. Implicitation of discourse connectives in (machine) translation. In Proceedings of the 1st DiscoMT Workshop at the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), number EPFL-CONF-192528.

Anne-Lyse Minard, Alessandro Marchetti, and Manuela Speranza. 2014. Event factuality in Italian: Annotation of news stories from the Ita-TimeBank. In Proceedings of the First Italian Conference on Computational Linguistics (CLiC-it 2014).

Silvia Pareti and Irina Prodanof. 2010. Annotating attribution relations: Towards an Italian discourse treebank. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10).

Rashmi Prasad, Eleni Miltsakaki, Nikhil Dinesh, Alan Lee, Aravind Joshi, Livio Robaldo, and Bonnie L Webber. 2007. The Penn Discourse Treebank 2.0 Annotation Manual.

Rashmi Prasad, Nikhil Dinesh, Alan Lee, Eleni Miltsakaki, Livio Robaldo, Aravind K Joshi, and Bonnie L Webber. 2008. The Penn Discourse TreeBank 2.0. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco, May.

Rashmi Prasad, Bonnie Webber, and Aravind Joshi. 2014. Reflections on the Penn Discourse Treebank, comparable corpora, and complementary annotation. Computational Linguistics.

Cornelis van Rijsbergen. 1979. Information retrieval. Butterworth, London.

Sara Tonelli, Giuseppe Riccardi, Rashmi Prasad, and Aravind K Joshi. 2010. Annotation of discourse relations for conversational spoken dialogs. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10).

Bonnie Webber, Rashmi Prasad, Alan Lee, and Aravind Joshi. 2016. A Discourse-Annotated Corpus of Conjoined VPs. In Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016), pages 22–31.

Nianwen Xue, Hwee Tou Ng, Sameer Pradhan, Rashmi Prasad, Christopher Bryant, and Attapol Rutherford. 2015. The CoNLL-2015 Shared Task on Shallow Discourse Parsing. In CoNLL Shared Task, pages 1–16.

Yuping Zhou and Nianwen Xue. 2015. The Chinese Discourse Treebank: A Chinese corpus annotated with discourse relations. Language Resources and Evaluation, 49(2):397–431.
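
Illustrative sketch of the agreement measures. The inter-annotator agreement figures reported in the paper rest on three computations: Dice's coefficient over the connectives identified by each annotator, exact and relaxed (at least 50%) matching of argument spans, and the 1 / 0.5 / 0 point scheme for sense types. The Python sketch below shows one way these could be computed. It is not the authors' implementation: the function names, the representation of spans as token offsets, and the choice of measuring the 50% overlap against the longer of the two spans are all assumptions made for the example.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Assumed representation: a span is (start, end) in token offsets, end exclusive.
Span = Tuple[int, int]


def dice(a: Set, b: Set) -> float:
    """Dice's coefficient: 2 * |A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty annotation sets
    return 2 * len(a & b) / (len(a) + len(b))


def exact_match(s1: Span, s2: Span) -> bool:
    """Exact match mode: both annotators selected the same tokens."""
    return s1 == s2


def relaxed_match(s1: Span, s2: Span, threshold: float = 0.5) -> bool:
    """Relaxed match mode: the spans overlap for at least `threshold`.

    Assumption: the overlap is normalised by the longer span; the paper
    does not state which span the 50% refers to.
    """
    overlap = max(0, min(s1[1], s2[1]) - max(s1[0], s2[0]))
    longest = max(s1[1] - s1[0], s2[1] - s2[0])
    return longest > 0 and overlap / longest >= threshold


def sense_points(tags1: FrozenSet[str], tags2: FrozenSet[str]) -> float:
    """1 for identical tag sets, 0.5 for one tag vs. both, 0 otherwise."""
    if tags1 == tags2:
        return 1.0
    if tags1 and tags2 and (tags1 < tags2 or tags2 < tags1):
        return 0.5
    return 0.0


def sense_agreement(pairs: Iterable[Tuple[FrozenSet[str], FrozenSet[str]]]) -> float:
    """Sum of points divided by the number of jointly identified relations."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no relations to compare")
    return sum(sense_points(a, b) for a, b in pairs) / len(pairs)
```

For instance, calling `sense_agreement` on three relations where the annotators fully agree on the first, one annotator adds CONCESSION to CONTRAST on the second, and they choose different single tags on the third gives (1 + 0.5 + 0) / 3 = 0.5. A different normalisation for the relaxed match (for example, the shorter span or the union of the two spans) would be equally defensible and would change the resulting figures.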