        Collective Elaboration of a Coreference
        Annotated Corpus for Portuguese Texts

     Evandro Fonseca1 , Vinicius Sesti1 , Sandra Collovini1 , Renata Vieira1 ,
                   Ana Luísa Leal2 and Paulo Quaresma3

        evandro.fonseca@acad.pucrs.br, vinicius.sesti@acad.pucrs.br,
    sandra.abreu@acad.pucrs.br, renata.vieira@pucrs.br, analeal@umac.mo,
                               pq@di.uevora.pt

           1
               Pontifical Catholic University of Rio Grande do Sul (PUCRS)
                                   2
                                     University of Macau
                                    3
                                      University of Évora


       Abstract. This paper describes the collaborative creation of a corpus
       with coreference annotation for Portuguese. The annotation was per-
       formed with the coreference annotation tool CORP and the editing tool
       CorrefVisual. The texts were automatically annotated and manually re-
       vised by Portuguese speakers. As a result, a new corpus for coreference
       studies was produced for Portuguese.


1    Introduction
In this paper we describe the collaborative creation of an annotated corpus. Seven
teams participated in the task. The texts were chosen by the teams themselves.
As a result of this task, we created ‘Corref-PT’, a coreference corpus for Por-
tuguese. The texts submitted by the teams were first annotated with CORP
[10], a nominal coreference resolution tool for Portuguese. Then, the editing tool
CorrefVisual [28] was used for the manual revision of the previously annotated
texts. Agreement was measured with Kappa, considering the concordance among
team members and across teams.
    The paper is organized as follows. Section 2 presents an overview of the prob-
lem of coreference resolution. Related work is presented in Section 3. Section 4
presents an overview of the corpus submission and information about participat-
ing teams. Section 5 describes the corpus annotation, including the distribution
of texts among annotators, annotation tools and annotation agreement. Section
6 describes the results of this IberEval task: the Corref-PT corpus, its metrics
and a brief discussion regarding the annotation process and its problems. Finally,
in Section 7, the conclusions and future work are presented.

2    Coreference Resolution
Coreference resolution basically consists of finding different references to the same
entity in a text, as in the example: “A França resiste como único país da União
           Européia a não permitir o patenteamento de genes”. The noun phrases [único
           país da União Européia a não permitir o patenteamento de genes] and [A França]
           are considered coreferent. In other words, they belong to the same coreference
           chain.
                Coreference resolution may provide important input for other NLP tasks. One
           example is the area of entity relation extraction, since coreference links may be
           useful for extracting implicit relations [12]. Consider the following sentence:
                 “[Barack Obama] said today that climate change is a great threat to
            the planet. [The United States president] ...”. When identifying and creating a
           coreference relation between [Barack Obama] and [the United States president],
           it is possible to infer a relation between the entities [Barack Obama] and [United
           States] (in which Barack Obama is the president of the United States). Also,
           when we link Barack Obama with the president, it is possible to classify him as
           a person, as well as to say that he has a relation with the United States.
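                As an illustration only (not part of the tools described in this paper), the
            following Python sketch shows, under hypothetical names, how a coreference chain
            can be represented as a simple data structure and how linking a named entity to a
            descriptive noun phrase lets information such as an entity type propagate along
            the chain:

                from dataclasses import dataclass, field
                from typing import List, Optional

                @dataclass
                class Mention:
                    text: str                          # surface form of the noun phrase
                    entity_type: Optional[str] = None  # e.g. "PERSON", if already known

                @dataclass
                class CoreferenceChain:
                    mentions: List[Mention] = field(default_factory=list)

                    def propagate_type(self) -> Optional[str]:
                        # If any mention in the chain has a known type, share it with the rest.
                        known = next((m.entity_type for m in self.mentions if m.entity_type), None)
                        for m in self.mentions:
                            m.entity_type = m.entity_type or known
                        return known

                # The example from the text: the named entity and the descriptive NP corefer.
                chain = CoreferenceChain([
                    Mention("Barack Obama", entity_type="PERSON"),
                    Mention("the United States president"),
                ])
                chain.propagate_type()
                print([(m.text, m.entity_type) for m in chain.mentions])
                # [('Barack Obama', 'PERSON'), ('the United States president', 'PERSON')]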


           3     Related Work

           Coreference resolution is very important in understanding texts; thus, it is a
           crucial step in many high-level natural language processing tasks, ranging from informa-
           tion extraction to text summarization or machine translation [30]. In general,
           the evaluation of systems devoted to this task depends on reference corpora
           (gold standards). There are, for example, English coreference annotated cor-
           pora that have been used in coreference resolution tracks such as SemEval,
           ACE and CoNLL [3,29,24,8,23,22]. SemEval (Evaluation Exercises on Seman-
           tic Evaluation) includes, among others tasks, the Coreference Resolution task
           [24], considering multiple languages (Catalan, Dutch, English, German, Italian
           and Spanish). This task involved automatically detecting full coreference chains,
           composed of named entities, pronouns, and full noun phrases. The datasets used
           in SemEval task were extracted from five corpora: 1) the AnCora corpora [25]
           for Catalan and Spanish; 2) the KNACK-2002 corpus [16] for Dutch; 3) the
           OntoNotes Release 2.0 corpus for English [23]; 4) the TüBa-D/Z corpus [15] for
           German; and 5) the LiveMemories corpus [26] for Italian.
               CoNLL-2011 Coreference Task included a closed (limited to using the dis-
           tributed resources) and an open track (unrestricted use of external resources).
           The task was to automatically identify mentions of entities and events in texts
           and to link the coreferring mentions together to form coreference chains. For
           this, the participants could use information from other structural layers includ-
           ing parsing, semantic roles, word sense and named entities. It was based on
           OntoNotes 4.0 [22].
               The OntoNotes is a large-scale corpus of general anaphoric coreference not
           restricted to noun phrases or to a specified set of entity types [23,22]. In ad-
           dition to coreference, the corpus provides other layers of annotation: syntactic
           trees; propositions structures of verbs; partial verb and noun word senses; and
           18 named entity types. OntoNotes is a multi-lingual resource with annotations
           available in three languages: English, Chinese and Arabic.




               The OntoNotes corpus is of crucial importance for data modeling of linguistically
           easier cases of coreference. Complex cases have been investigated only more recently, and
           one of the main reasons for this is the lack of appropriate datasets [29]. The
           ARRAU dataset is a multi-domain corpus with large-scale annotations of vari-
           ous linguistic phenomena related to anaphora. A second release of the ARRAU
           is presented in [29], and the authors not only focused on increasing the number
           of documents, but also invested a considerable effort into improving the data
           quality. The data is manually labeled for tasks such as coreference resolution,
           bridging, mention detection, referentiality and genericity. The documents were
           annotated for anaphoric information, using the MMAX (Multi-Modal Annota-
           tion in XML) tool, which is specific for corpus annotation, with a main focus on
           the annotation of coreference [21]. The annotation followed the ARRAU guide-
           lines, which focused on a more detailed representation of linguistic phenomena
           related to anaphora and coreference. The authors present the main differences
           between ARRAU and two coreference corpora: ACE and OntoNotes. One dif-
           ference between these corpora stands out: ARRAU considers different types of
           noun phrases, including markables that do not participate in coreference chains
           (singletons and non-referentials). Also, this corpus combines coreference with
           bridging, and for the third release of ARRAU, the authors plan to focus on
           bridging.
               One of the difficulties for the creation of annotated corpora is the availability
           of specialists for this task. An alternative is the crowdsourcing approach, which
           uses a non-expert crowd to annotate text, driven by cost, speed and scalability
           [17]. In [3], Phrase Detectives, an interactive online game for creating annotated
           anaphoric coreference corpora using the GWAP (game-with-a-purpose) approach,
           is presented. The Phrase Detectives Corpus 1.0 contains 45 documents from
           Wikipedia articles and narrative text, with 6,452 markables.
               HAREM is a joint evaluation effort for Portuguese (Avaliação de Sistemas
           de Reconhecimento de Entidades Mencionadas) [27]. This contest had the pur-
           pose of studying expressions regarding proper names (mentioned entities). The
           Second HAREM took place in 2008 and it included the task of identifying
           the semantic relations between mentioned entities, called ReRelEM track (Re-
           conhecimento de Relações entre Entidades Mencionadas). This was concerned
           with the automatic detection of relations between named entities in a document
           [11]. ReRelEM, although maintaining the restriction to named entities, is also
           a source of coreference annotation, since the authors proposed the detection of
           relations between named entities, including coreference, represented by the re-
           lation of Identity (entities with the same referent, defined for all the categories
           and whose instances must have the same category).
               Another related Portuguese corpus is the Summ-it corpus [4,1]. It is a cor-
           pus gathering annotations of various linguistic levels, including coreference, but
           also morphological, syntactic and rhetorical relations. Summ-it has a total of
           560 coreference chains with an average of 3 noun phrases per chain, where the
           largest chain has 16 members (noun phrases). Recently, a new version of Summ-
           it corpus was enriched with two layers: named entities and the relations that




           occur between these entities [6,5]; this version is called Summ-it++1 and is de-
           scribed in [1]. The coreference information is the same as in the original Summ-it
           corpus. However, other layers of linguistic (morpho-syntactic) information were
           generated by other tools and converted to a new format based on SemEval [24].
           Garcia's corpus also contains coreference annotation, but only for Person en-
           tities [13]. It is also given in the SemEval format. It is a multilingual corpus2
           including Portuguese, Galician and Spanish. One of the motivations for this col-
           laborative task of creating an annotated corpus is, therefore, to increase the
           number of annotated coreference data for Portuguese. Instead of creating such
           an annotated corpus from scratch, we adopted a different methodology: we proposed
           the editing of coreference chains produced by a coreference resolution tool.


           4      Corpus submission and participating teams

           The general objective of the proposed task was the collective elaboration of a Por-
           tuguese annotated corpus for nominal coreference. For that, each participant
           team submitted a corpus of their own interest. Seven teams submitted their cor-
           pus. The resulting corpus is composed of journalistic texts [20]; miscellaneous
           texts (books, magazines, journalistic texts, among others) [7]; and Wikipedia dump
           articles, selected randomly. The corpus is further described in Section 6.1.
               The first phase of the task consisted of corpus submission by participant
           teams. Each participant team submitted around 30 texts written in Portuguese,
           considering domains of their own interest. The proposed average size for these
           texts was 1200 tokens each. In addition, each team justified the reason(s) for its corpus
           choice, including related studies. Seven groups submitted texts for annota-
           tion. Three main text sets were submitted, as described below and detailed in
           Table 1.

               – CSTNews [20] is a corpus developed for multi-document summarization and
                 used for several studies in Portuguese, mainly for research on discourse
                 phenomena. It was divided into five parts, one for each group from USP.
               – A sample of the larger corpus PAROLE [7], compiled in the scope of the Eu-
                 ropean project LE-PAROLE. For each language involved in the project, a 20
                 million word corpus was built with harmonized design, composition and cod-
                 ification, including a 250,000-word subcorpus, tagged with POS information
                 and revised manually.
               – Wikipedia articles written in Portuguese. This corpus is an extract
                 composed of 30 entire articles, each with more than 1100 and fewer than 1400
                 words, randomly selected from the Wikipedia dump of 26/03/2017.

              There was a training phase, during which participants became familiar with the
           editing tool [28]. We provided one text annotated by CORP [10]. Each team's members revised the
           coreference chains and could ask questions about the task.
            1
                http://www.inf.pucrs.br/linatural/summit_plus_plus.html
            2
                http://gramatica.usc.es/~marcos/coling14.tar.bz




                                   Team    Corpus          Texts
                                   USP 1   CST-News 1/5      28
                                   USP 2   CST-News 2/5      28
                                   USP 3   CST-News 3/5      28
                                   USP 4   CST-News 4/5      28
                                   USP 5   CST-News 5/5      25
                                   UFBA    Wikipedia         30
                                   EVORA   Le-Parole         12
                                            Table 1. Submitted texts



               Finally, there was the annotation phase. First, all texts were annotated with
            CORP; then, each team received its own corpus plus a few extra texts included for
            measuring team-level agreement (according to Table 2). The corpus annotation
           phase is described in detail in the next section.


           5     Corpus Annotation
           5.1     Text Distribution among Annotators
            The corpora were received and distributed among team members in a way that
            allows agreement measures. For that, we first used a set of three texts chosen by
            the organizers for calculating inter-team agreement; second, a subgroup
            of four texts from each submitted corpus was annotated by all members of
            its respective team. In Table 2, we exemplify how we organized the distribution
           of texts. This example considers a scenario of a team with three annotators and
           a corpus of sixteen texts. Each member annotated one of our chosen texts for
            inter-team agreement (TK1, TK2 and TK3), whereas four texts of the submitted
           corpus were replicated to all annotators of that team (TG1, TG2, TG3 and TG4).


                                  Participant 1 Participant 2 Participant 3
                                      TK1           TK2             TK3
                                      TG1           TG1             TG1
                                      TG2           TG2             TG2
                                      TG3           TG3             TG3
                                      TG4           TG4             TG4
                                      TG5           TG6             TG7
                                      TG8           TG9            TG10
                                      TG11          TG12           TG13
                                      TG14          TG15           TG16
                                         Table 2. Distribution scheme
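                To make the scheme concrete, the following Python sketch (an illustrative
            helper, not the organizers' actual script) reproduces the distribution of Table 2,
            assuming the remaining texts of the submitted corpus are split round-robin among
            the team members:

                def distribute(annotators, inter_team_texts, team_corpus, n_overlap=4):
                    # Each annotator gets one inter-team text plus the overlap texts shared
                    # by the whole team; the rest of the corpus is split round-robin.
                    overlap, rest = team_corpus[:n_overlap], team_corpus[n_overlap:]
                    plan = {a: [inter_team_texts[i]] + list(overlap)
                            for i, a in enumerate(annotators)}
                    for i, text in enumerate(rest):
                        plan[annotators[i % len(annotators)]].append(text)
                    return plan

                annotators = ["Participant 1", "Participant 2", "Participant 3"]
                inter_team = ["TK1", "TK2", "TK3"]
                corpus = [f"TG{i}" for i in range(1, 17)]   # sixteen submitted texts
                for person, texts in distribute(annotators, inter_team, corpus).items():
                    print(person, texts)                    # matches the columns of Table 2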



               The texts were then annotated with coreference and distributed to each
            team. The annotation consisted of editing the generated chains. Next, we de-




           scribe CORP, the coreference resolution tool [10], and CorrefVisual [28], the
           editing tool used in this task.



           5.2     Annotation tools


            The annotation task is based on a previous annotation generated by a coreference
            resolution tool and on the editing of the generated chains with the help of an editing
            tool, as described below.



           CORP is a coreference resolution tool for Portuguese [9] which was built on the
            basis of deterministic rules, in line with previous tools proposed for English
           [19,18]. An important difference from these previous works for English is, how-
           ever, the inclusion of semantic knowledge, which is provided by Onto.PT [14].
            The tool produces two outputs: the first in XML, containing the original text, the
           list of sentences, tokens, part-of-speech, coreference chains (Figure 1) and single
           mentions. This format allows the interoperability with other applications. The
           second output format is given in HTML for the visualization of generated coref-
           erence chains, which can be seen through the tool’s web interface3 . A desktop
           version is also available for download4 .



            CorrefVisual is a tool developed to allow the editing of coreference
            chains annotated by CORP. It provides a user-friendly graphical interface for
            visualizing NPs and placing them in other coreference chains. It also allows the edit-
            ing of noun phrases, creation and deletion of chains, and persistence of changes.
              The interface displays information in three different main panels: the first dis-
           plays the text and selected noun phrases; the second displays coreference chains,
           each in a particular subpanel; and the third displays single (non-coreferent) noun-
           phrases (unique mentions). Each chain is associated with one color in order to
           show the different chains.
               Upon selection of noun phrases, they are highlighted in the text according to
            their chain's color. In Figure 2, one chain is highlighted. CorrefVisual is available
           for download5 .

            3
              http://ontolp.inf.pucrs.br/corref/
            4
               http://www.inf.pucrs.br/linatural/wordpress/index.php/recursos-e-ferramentas/corp-coreference-resolution-for-portuguese/
             5
               http://www.inf.pucrs.br/linatural/wordpress/index.php/recursos-e-ferramentas/correfvisual/







                                     Fig. 1. CORP - XML coreference chains




                 Fig. 2. CorrefVisual - mentions in a chain are highlighted in the text panel.


           5.3     Annotation agreement
            We measured annotation agreement on the basis of the Kappa statistic. Kappa is
            usually used to measure agreement on the classification of items into categories. For the corefer-
            ence task, we need to calculate the agreement on complex elements: coreference




           chains. Basically, a coreference chain may have two or more noun phrases. Thus,
           for the correct calculation of agreement, we need to transform these chains into
           items that may be analysed as a category.
               One way of performing that is to transform each chain into coreference pairs.
           That is, for the chain C={a,b,c}, wherein ‘a’, ‘b’ and ‘c’ represent noun phrases,
           we represent it as follows: P={(a,b),(a,c),(b,c)}.
               To perform the calculation, we need to consider the set of documents (D) and
           the set of annotators (A). For example, for a set of documents D={d1, d2, d3}
           and set of annotators A={a1, a2, a3}, we create, for each document dx belonging
            to the set of documents D, the set Udx, where Udx is the union of all coreference
            chains annotated for that document, such that Udx = {dxa1 ∪ dxa2 ∪ dxa3}.
               Assuming that annotator a1 has created two coreference chains: c1a1 ={a, b,
           c}, c2a1 ={d, f}, and annotators a2 and a3 have considered only one, c1a2 ={a,b,c},
           c1a3 ={a,b,c}, while d and f are annotated as non-coreferent by both, the result-
           ing union set is Ud1 = {a, b, c, d, f }.
               Then we transform the union set into pairs and determine which pairs are
           considered coreferent or not by each annotator. The set of pairs is PU d1 = {(a,b),
           (a,c), (a,d), (a,f), (b,c), (b,d), (b,f), (c,d), (c,f), (d,f)}.
               In Table 3, we can see the Kappa calculation of this example. Each pair
           represents an item to be classified as Coreferent or Non-Coreferent. The pairs
           (a, b), (a, c) and (b, c) appear in the same coreference chain for three annotators,
           indicating that they considered them coreferent. The pairs (a, f), (b, d), (b, f),
            (c, d) and (c, f) were considered non-coreferent by all annotators. For the pair
            (d, f), there was a disagreement among the annotators; thus, for this pair, the Coreferent
            class receives '1' and the Non-Coreferent class receives '2'. This process, done
            for document d1, is repeated for the other documents. We calculate Kappa [2] from
            the values represented in Table 3.



                                  Pair Coreferent Non-Coreferent                S
                                   a,b       3                 0                 1
                                   a,c       3                 0                 1
                                   a,d       0                 3                 1
                                   a,f       0                 3                 1
                                   b,c       3                 0                 1
                                   b,d       0                 3                 1
                                   b,f       0                 3                 1
                                   c,d       0                 3                 1
                                    c,f      0                 3                 1
                                   d,f       1                 2              0.333
                                  N=10    C1=10           C2=20              Z=9.333
                                   Table 3. Dataset for ‘d1 ’, ‘a1 ’, ‘a2 ’ and ‘a3 ’
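                The computation above can be reproduced in a few lines of code. The sketch
            below is illustrative rather than the authors' script: it assumes a Fleiss-style
            multi-annotator Kappa over the two categories, which is consistent with the S
            column and the totals of Table 3, and it expands each annotator's chains into
            pairs over the union of mentions before measuring agreement:

                from itertools import combinations

                def chain_pairs(chains):
                    # All unordered mention pairs that share a chain for one annotator.
                    return {frozenset(p) for c in chains for p in combinations(sorted(c), 2)}

                def pairwise_kappa(annotations):
                    # annotations: one list of chains (sets of mentions) per annotator.
                    n = len(annotations)
                    mentions = sorted({m for chains in annotations for c in chains for m in c})
                    items = [frozenset(p) for p in combinations(mentions, 2)]
                    per_annotator = [chain_pairs(chains) for chains in annotations]
                    # For each pair, count annotators marking it Coreferent / Non-Coreferent.
                    counts = []
                    for pair in items:
                        coref = sum(pair in ann for ann in per_annotator)
                        counts.append((coref, n - coref))
                    # Fleiss-style kappa over the two categories.
                    N = len(items)
                    p_obs = sum(c * (c - 1) + nc * (nc - 1) for c, nc in counts) / (N * n * (n - 1))
                    p_coref = sum(c for c, _ in counts) / (N * n)
                    p_exp = p_coref ** 2 + (1 - p_coref) ** 2
                    return (p_obs - p_exp) / (1 - p_exp)

                # Worked example from the text: a1 annotated two chains, a2 and a3 one each.
                a1 = [{"a", "b", "c"}, {"d", "f"}]
                a2 = [{"a", "b", "c"}]
                a3 = [{"a", "b", "c"}]
                print(round(pairwise_kappa([a1, a2, a3]), 3))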




           5.4     Kappa Results
           Table 4 shows the resulting Kappa for each team and across teams. The lowest
           concordance was 0.41 and the highest was 0.64 for teams (intragroup). Kappa
           was 0.51 when calculated among different teams (intergroup). For intergroup
            agreement, only six teams were considered due to a few missing annotated
            texts.
               According to the interpretation given in Table 5 [31], the resulting Kappa
           indicates mainly moderate agreement, which is in line with what was expected
           for such a challenging task.


                                    Team      Members Overlap Texts Kappa
                                   USP 1         3             4          0.51
                                   USP 2         4             4          0.48
                                   USP 3         3             4          0.55
                                   USP 4         3             4          0.64
                                   USP 5         3             4          0.57
                                   UFBA          3             4          0.43
                                  EVORA          2             2          0.41
                               INTERGROUP        6             3          0.51
                                  Table 4. Concordance intra and intergroup




                                      Kappa                      Agreement
                                      <0          Less than chance agreement
                                      0.01 - 0.20            Slight agreement
                                      0.21 - 0.40              Fair agreement
                                      0.41 - 0.60        Moderate agreement
                                      0.61 - 0.80       Substantial agreement
                                      0.81 - 0.99 Almost perfect agreement
                                         Table 5. Interpretation of Kappa




           6      Corref-PT
            As a result of this IberEval task, we obtained a coreference corpus for Por-
            tuguese: Corref-PT. The corpus was annotated as an effort made by seven teams,
            with a total of twenty-one native Portuguese-speaking annotators, ranging from
            students to professors in the area of computational linguistics. The corpus is
            available in CORP's XML (Figure 1) and the SemEval format [24] used by other
            well-known coreference corpora, such as OntoNotes [22], Summ-it++ [1] and
           Garcia’s corpus [13]. Corref-PT is available for download6 .
            6
                 http://www.inf.pucrs.br/linatural/wordpress/index.php/recursos-e-ferramentas/corref-pt/




                In Table 6, we show the SemEval format. It is available in a single file, con-
            taining all texts. Each text document is delimited by a "#begin document
            ID" line and another line containing only "#end document". Each sentence's
            information is organized vertically, with one token per line, and a blank line af-
            ter the last token of each sentence. The information associated with each token
            is available in columns (separated by a tab character - "\t"). The annotation
            columns contain, respectively: the token's ID in the sentence; the word or multiword it-
            self; the lemma; the word's part-of-speech tag; features (gender and number);
            Head, denoting whether the word is the head word of the NP (if so, this field receives
            '0'); and coreference information, where each coreferent noun phrase starts with
            "(", followed by the chain's ID, and ")" occurs only at the last token of the NP.
            Coreferent NPs receive the same chain ID (a small illustrative reader for this
            format is sketched after Table 6).



                    ID Token        Lemma        POS       Feat        Head Corref
                    1 Segundo       segundo      prp       _           _    _
                    2 informações informar       n         F=P         0    _
                    3 de            de           prp       _           _    _
                    4 a             o            art       F=S         _    _
                    5 assessoria    assessoria   n         F=S         0    _
                    6 de            de           prp       _           _    _
                    7 o             o            art       M=S         _    (2
                    8 apresentador apresentador n          M=S         0    2)
                    9 ,                          ,         _           _    _
                    10 ele          ele          pron-pers M=3S=NOM 0       (2)
                    11 não          não          adv       _           _    _
                    12 poderia      poder        v-fin     COND=3S _        _
                    13 comparecer comparecer v-inf         _           _    _
                    14 a            a            prp       _           _    _
                    15 o            o            art       M=S         _    _
                    16 Deic                      prop      M=S         0    _
                    17 em           em           prp       _           _    _
                    18 a            o            art       F=S         _    (5
                    19 quarta-feira quarta-feira n         F=S         0    5)
                    20 ...
                                    Table 6. Corref-PT - SemEval format
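                The sketch below is a minimal, illustrative reader for this column layout,
            not an official loader. It relies only on the conventions described above
            (tab-separated columns, "#begin document"/"#end document" delimiters and the
            bracketed chain IDs in the last column); the use of "|" to separate several chain
            brackets on one token is an assumption borrowed from similar CoNLL-style formats.

                from collections import defaultdict

                def read_corref_column(path):
                    # Return {document_id: {chain_id: [(start, end) token spans]}}.
                    docs, doc_id, open_spans, tok_idx = {}, None, {}, 0
                    with open(path, encoding="utf-8") as f:
                        for line in f:
                            line = line.rstrip("\n")
                            if line.startswith("#begin document"):
                                doc_id = line.split("#begin document", 1)[1].strip()
                                docs[doc_id], open_spans, tok_idx = defaultdict(list), {}, 0
                                continue
                            if line.startswith("#end document") or not line.strip():
                                continue
                            corref = line.split("\t")[-1]        # last column: chain brackets
                            for part in corref.split("|"):       # assumed multi-chain separator
                                if part.startswith("("):
                                    open_spans[int(part.strip("()"))] = tok_idx
                                if part.endswith(")"):
                                    chain = int(part.strip("()"))
                                    start = open_spans.pop(chain, tok_idx)
                                    docs[doc_id][chain].append((start, tok_idx))
                            tok_idx += 1
                    return docs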




           6.1     Corpus Metrics


            Corref-PT is composed of texts from the CSTNews corpus [20]; from the Parole
            corpus (miscellaneous texts from books, magazines, journalistic sources, among others)
            [7]; randomly selected Wikipedia articles; and also a few scientific texts from




            Fapesp Magazine7 . Metrics about the number of texts, tokens, mentions, coreferent
            mentions, coreference chains and chain sizes are shown in Table 7.


                Corpus          Texts Tokens Mentions Coreferent Coreference Largest Avg. Chain
                                                       Mentions    Chains     Chain     Size
                CST-News         137  54445   14680      6797       1906       25       3.6
                Le-Parole         12  21607    5773      2202        573       38       3.8
                Wikipedia         30  44153   12049      4973       1308       53       3.8
                Fapesp Magazine    3   3535    1012       496        111       33       4.5
                Total            182 123740   33514     14468       3898       53       3.7
                                        Table 7. Corref-PT - Corpus Metrics




           6.2     Annotation task evaluation
            The annotators evaluated the task regarding a few issues raised in a
            survey on Google Forms. Fifteen of the 21 participants sent their answers. They
            were asked about their confidence level in the annotation, whether the previous
            automatic annotation was helpful for the task, and about the necessity of noun
            phrase editing for the task (considering that noun phrase identification was made
           automatically by a parser). We can see in Figure 3 that few annotators had high
           confidence in their annotation. Most participants were not sure about this issue.




                                        Fig. 3. Question 1 - confidence level



               Regarding the previous annotation (Figure 4), most participants were ambivalent
            about whether it helps the process or not, but a greater number thought it was
            helpful.




                              Fig. 4. Question 2 - usefulness of previous annotation
            7
                http://revistapesquisa.fapesp.br/




               Regarding noun phrase editing (Figure 5), 60% of participants strongly agreed
            that it is indispensable for the annotation task. Correctly identified mentions are
            indeed a crucial pre-processing requirement for building the chains. The main
            problem here was that the task was in fact mostly fixed regarding mention
            detection, since it was based on the parser's NP chunks. Suggestions given by the
            annotators were mostly related to CorrefVisual's usability - one major problem was
            related to noun phrase editing. That was very difficult for the annotators to handle,
            since correct mention detection is required for identifying coreference chains
            correctly, but the tool was not primarily meant for that.




                                Fig. 5. Question 3 - noun phrase edition required



           7     Conclusion
           In this paper, we presented a collaborative coreference annotation task which
           resulted in a coreference corpus for Portuguese with nearly 4000 chains. Con-
            sidering Summ-it++, a previously available resource of the kind, with around
           500 chains, we now have a coreference annotated corpus with 8 times as many
           chains. The resource is available both in the SemEval format and in CORP’s
           XML8 . The annotated corpus can be visualized in the CorrefVisual tool9 . For
            the next steps, we need to address issues regarding automatic mention detec-
            tion, which seems to be a major pre-processing issue for this task, and similarly
            we also need to improve the means for manual editing of mentions, if we consider fur-
           ther annotation tasks. Regarding the annotation agreement, we can see that
           there is mainly moderate agreement. However, as future work, a revision of this
           annotation should be done in order to improve the quality of annotation.


           References
            1. A. Antonitsch, A. Figueira, D. Amaral, E. Fonseca, R. Vieira, and S. Collovini.
               Summ-it++: an enriched version of the summ-it corpus. In N. Calzolari,
               K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani,
               H. Mazo, A. Moreno, J. Odijk, and S. Piperidis, editors, Proceedings of the Tenth
               International Conference on Language Resources and Evaluation (LREC 2016),
            8
               http://www.inf.pucrs.br/linatural/wordpress/index.php/recursos-e-ferramentas/corref-pt/
             9
               http://www.inf.pucrs.br/linatural/wordpress/index.php/recursos-e-ferramentas/correfvisual/




               pages 2047–2051, Paris, France, 2016. European Language Resources Association
               (ELRA).
            2. J. Carletta. Assessing agreement on classification tasks: the kappa statistic. Com-
               putational linguistics, 22(2):249–254, 1996.
            3. J. Chamberlain, M. Poesio, and U. Kruschwitz. Phrase detectives corpus 1.0
               crowdsourced anaphoric coreference. In N. Calzolari, K. Choukri, T. Declerck,
               S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk,
               and S. Piperidis, editors, Proceedings of the Tenth International Conference on
               Language Resources and Evaluation (LREC 2016), pages 2039–2046, Paris, France,
               2016. European Language Resources Association (ELRA).
            4. S. Collovini, T. I. Carbonel, J. T. Fuchs, J. C. Coelho, L. Rino, and R. Vieira.
               Summ-it: Um corpus anotado com informações discursivas visando a sumarização
               automática. In Proceedings of V Workshop em Tecnologia da Informação e da
               Linguagem Humana , Rio de Janeiro, RJ, Brasil, pages 1605–1614, 2007.
            5. S. Collovini de Abreu and R. Vieira. Relp: Portuguese open relation extraction.
               Knowledge Organization, 44(3):163–177, 2017.
            6. D. O. F. do Amaral and R. Vieira. NERP-CRF: uma ferramenta para o reconhec-
               imento de entidades nomeadas por meio de conditional random fields. 6(1):41–49,
               2014.
            7. M. F. B. do Nascimento, A. Mendes, and L. Pereira. Providing on-line access
               to portuguese language resources: Corpora and lexicons. In Proceedings of the
               International Conference on Language Resources and Evaluation , Portugal, 2004.
            8. G. Doddington, A. Mitchell, M. Przybocki, L. Ramshaw, S. Strassel, and
               R. Weischedel. The automatic content extraction (ace) program: Tasks, data,
               and evaluation. In M. T. Lino, M. F. Xavier, F. Ferreira, R. Costa, and R. Silva,
               editors, Proceedings of the 4th International Conference on Language Resources
               and Evaluation – LREC 2004, pages 837–840, Lisboa, 2004.
            9. E. B. Fonseca, V. Sesti, A. Antonitsch, A. A. Vanin, and R. Vieira. Corp - uma
               abordagem baseada em regras e conhecimento semântico para a resolução de cor-
               referências. Linguamatica, 9(1):3–18, 2017.
           10. E. B. Fonseca, R. Vieira, and A. Vanin. Corp: Coreference resolution for por-
               tuguese. In 12th International Conference on the Computational Processing of
               Portuguese, Demo Session (PROPOR), 2016.
           11. C. Freitas, C. Mota, D. Santos, H. G. Oliveira, and P. Carvalho. Second HAREM:
               advancing the state of the art of named entity recognition in portuguese. In Pro-
               ceedings of the International Conference on Language Resources and Evaluation,
               LREC, Valletta, Malta, 2010.
           12. R. Gabbard, M. Freedman, and R. Weischedel. Coreference for learning to extract
               relations: yes, virginia, coreference matters. In Proceedings of the 49th Annual
               Meeting of the Association for Computational Linguistics: Human Language Tech-
               nologies: short papers-Volume 2, pages 288–293. Association for Computational
               Linguistics, 2011.
           13. M. Garcia and P. Gamallo. Multilingual corpora with coreferential annotation of
               person entities. In Proceedings of the 9th edition of the Language Resources and
               Evaluation Conference - LREC, pages 3229–3233, 2014.
            14. H. Gonçalo Oliveira. Onto.PT: Towards the Automatic Construction of a Lexical
                Ontology for Portuguese. PhD thesis, Univ. of Coimbra/FST, 2012.
           15. E. W. Hinrichs, S. Kübler, and K. Naumann. A unified representation for mor-
               phological, syntactic, semantic, and referential annotations. In Proceedings of the
               Workshop on Frontiers in Corpus Annotations II: Pie in the Sky, CorpusAnno ’05,




               pages 13–20, Stroudsburg, PA, USA, 2005. Association for Computational Linguis-
               tics.
           16. V. Hoste and G. De Pauw. Knack-2002: a richly annotated corpus of dutch writ-
               ten text. In Proceedings of The Fifth international conference on Language Re-
               sources and Evaluation, pages 1432–1437, Genoa, Italy, 2006. European Language
               Resources Association, European Language Resources Association.
           17. J. Howe. Crowdsourcing: Why the Power of the Crowd Is Driving the Future of
               Business. Crown Publishing Group, New York, NY, USA, 1 edition, 2008.
           18. H. Lee, A. Chang, Y. Peirsman, N. Chambers, M. Surdeanu, and D. Jurafsky.
               Deterministic coreference resolution based on entity-centric, precision-ranked rules.
               volume 39, pages 885–916. Computational Linguistics - MIT Press, 2013.
           19. H. Lee, Y. Peirsman, A. Chang, N. Chambers, M. Surdeanu, and D. Jurafsky. Stan-
               ford’s multi-pass sieve coreference resolution system at the conll-2011 shared task.
               In Proceedings of the Fifteenth Conference on Computational Natural Language
               Learning: Shared Task. Association for Computational Linguistics, 2011.
           20. E. G. Maziero, M. L. del Rosario Castro Jorge, and T. A. S. Pardo. Identifying
               multidocument relations. In Natural Language Processing and Cognitive Science,
               Proceedings of the 7th International Workshop on Natural Language Processing
               and Cognitive Science, NLPCS 2010, In conjunction with ICEIS 2010, Funchal,
               Madeira, Portugal, June 2010, pages 60–69, 2010.
           21. C. Müller and M. Strube. Mmax: A tool for the annotation of multi-modal corpora.
               In Proceedings of the 2nd IJCAI Workshop on Adaptive Text Extraction and Mining
               - IJCAI 2001, Seattle, Washington, 2001.
           22. S. Pradhan, L. Ramshaw, M. Marcus, M. Palmer, R. Weischedel, and N. Xue.
               Conll-2011 shared task: Modeling unrestricted coreference in ontonotes. In Pro-
               ceedings of the Fifteenth Conference on Computational Natural Language Learning:
               Shared Task, pages 1–27. Association for Computational Linguistics, 2011.
           23. S. S. Pradhan, E. Hovy, M. Marcus, M. Palmer, L. Ramshaw, and R. Weischedel.
               Ontonotes: A unified relational semantic representation. In Proceedings of the In-
               ternational Conference on Semantic Computing, ICSC ’07, pages 517–526, Wash-
               ington, DC, USA, 2007. IEEE Computer Society.
           24. M. Recasens, L. Màrquez, E. Sapena, M. A. Martí, M. Taulé, V. Hoste, M. Poesio,
               and Y. Versley. Semeval-2010 task 1: Coreference resolution in multiple languages.
               In Proceedings of the 5th International Workshop on Semantic Evaluation, pages
               1–8. Association for Computational Linguistics, 2010.
           25. M. Recasens and M. A. Martí. Ancora-co: Coreferentially annotated corpora for
               spanish and catalan. Language Resources and Evaluation, 44(4):315–345, 2010.
           26. K. J. Rodríguez, F. Delogu, Y. Versley, E. Stemle, and M. Poesio. Anaphoric
               annotation of wikipedia and blogs in the live memories corpus. In N. Calzo-
               lari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner, and
               D. Tapias, editors, Proceedings of the International Conference on Language Re-
               sources and Evaluation - LREC. European Language Resources Association, 2010.
            27. D. Santos, N. Cardoso, N. Seco, and R. Vilela. Breve introdução ao HAREM.
                HAREM, a primeira avaliação conjunta de sistemas de reconhecimento de enti-
                dades mencionadas para português: documentação e actas do encontro, Linguateca,
                2007.
           28. M. d. O. Tubino and M. M. S. Silva. Visualização, manipulação e refinamento
               de correferência em língua portuguesa. Trabalho de conclusão de curso, Pontifícia
               Universidade Católica do Rio Grande do Sul, 2015.




           29. O. Uryupina, R. Artstein, A. Bristot, F. Cavicchio, K. Rodriguez, and M. Poesio.
               ARRAU: Linguistically-Motivated Annotation of Anaphoric Descriptions. In Pro-
               ceedings of the Tenth International Conference on Language Resources and Evalua-
               tion (LREC 2016), pages 2058–2062, Portorož, Slovenia, 2016. European Language
               Resources Association (ELRA).
           30. K. van Deemter and R. Kibble. What is coreference, and what should coreference
               annotation be? In Proceedings of the Workshop on Coreference and Its Applica-
               tions, CorefApp ’99, pages 90–96, Stroudsburg, PA, USA, 1999. Association for
               Computational Linguistics.
           31. A. J. Viera, J. M. Garrett, et al. Understanding interobserver agreement: the kappa
               statistic. Fam Med, 37(5):360–363, 2005.



