<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>R. Navigli)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Exploring the Dissociated Nucleus Phenomenon in Semantic Role Labeling</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tommaso Bonomo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simone Conia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberto Navigli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sapienza University of Rome, Dipartimento di Ingegneria Informatica, Automatica e Gestionale (DIAG)</institution>
          ,
          <addr-line>Via Ariosto 25, Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Dependency-based Semantic Role Labeling (SRL) is bound to dependency parsing, as the arguments of a predicate are identified through the token that heads the dependency relation subtree of the argument span. However, dependency-based SRL corpora are susceptible to the dissociated nucleus problem: when a subclause's semantic and structural cores are two separate words, the dependency tree chooses the structural token as the head of the subtree, coercing the SRL annotation into making the same choice. This leads to undesirable consequences: when directly using the output of a dependency-based SRL method in downstream tasks it is useful to work with the token representing the semantic core of a subclause, not the structural core. In this paper, we carry out a linguistically-driven investigation on the dissociated nucleus problem in dependency-based SRL and propose a novel algorithm that aligns predicate-argument structures to the syntactic structures from Universal Dependencies to select the semantic core of an argument. Our analysis shows that dissociated nuclei appear more often than one might expect, and that our novel algorithm greatly increases the richness of the semantic information in dependency-based SRL. We release the software to reproduce our experiments at https://github.com/SapienzaNLP/semdepalign.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Role Labeling</kwd>
        <kwd>Dependency Parsing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. SRL and Dependency Parsing</title>
      <p>Both SRL and Dependency Parsing investigate how words in the same sentence relate to each other, respectively in a semantic or syntactic sense. The Conference on Computational Natural Language Learning (CoNLL) organized several Shared Tasks regarding both tasks, culminating in the CoNLL-2008 Shared Task [<xref ref-type="bibr" rid="ref12">12</xref>] that asked participants to identify both types of relation within an English-only corpus. This task can be seen as the first occurrence of dependency-based SRL, as it explicitly ties the SRL annotations to the dependency relation tree of the sentence. The authors of the Shared Task implemented their own constituency-to-dependency parser to obtain the syntactic dependency relation trees, which are vulnerable by construction to the dissociated nucleus phenomenon.</p>
      <preformat>Algorithm 1: SemDepAlign
input: the role node role_node; the root node of the UD dep-tree root_ud.
output: the head node of the role in the UD dep-tree.

role_tokens ← get_tokens(role_node)
ud_role_subtree ← root_ud
min_nodes ← SymDiff(get_tokens(root_ud), role_tokens)
for node ← BFS(root_ud):
    subtree_tokens ← get_tokens(node)
    extra_nodes ← SymDiff(subtree_tokens, role_tokens)
    min_nodes ← min(min_nodes, extra_nodes)
return ud_role_subtree</preformat>
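      <p>As a rough illustration of Algorithm 1 (SemDepAlign), the following Python sketch searches a UD tree breadth-first for the node whose subtree has the smallest symmetric difference with the original role span. The Node class, the helper names and the hand-built tree are assumptions made for this example and are not taken from the released software. The sketch also makes two choices explicit that the pseudocode leaves open: the best node is updated whenever a strictly smaller difference is found, and ties keep the node encountered first.</p>
      <preformat>
```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical dependency-tree node: token index, word form, children."""
    index: int
    form: str
    children: list = field(default_factory=list)


def get_tokens(node):
    """Return the set of token indices covered by the subtree headed by node."""
    tokens = {node.index}
    for child in node.children:
        tokens |= get_tokens(child)
    return tokens


def bfs(root):
    """Yield the nodes of the tree in breadth-first order, root first."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children)


def sem_dep_align(role_tokens, root_ud):
    """Return the UD node heading the subtree closest to the role span."""
    best_node = root_ud
    best_size = len(get_tokens(root_ud) ^ role_tokens)
    for node in bfs(root_ud):
        # Size of the symmetric difference between the UD subtree and the span.
        size = len(get_tokens(node) ^ role_tokens)
        # Assumption: ties keep the node found first in BFS order.
        if best_size > size:
            best_node, best_size = node, size
    return best_node


# Hand-built UD-style tree for "That is a service to the nation" (0-indexed).
nation = Node(6, "nation", [Node(4, "to"), Node(5, "the")])
root = Node(3, "service", [Node(0, "That"), Node(1, "is"), Node(2, "a"), nation])

# In CoNLL-2009 the span "to the nation" is headed by "to"; its tokens are 4, 5, 6.
aligned_head = sem_dep_align({4, 5, 6}, root)
print(aligned_head.form)  # prints "nation"
```
      </preformat>
      <p>In this toy tree the subtree headed by ‘nation’ covers exactly the original span, so the symmetric difference is empty and the role moves from the adposition ‘to’ to the content word ‘nation’.</p>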
      <p>The dependency relation annotation scheme adopted in both CoNLL-2008 and its multilingual successor CoNLL-2009 [<xref ref-type="bibr" rid="ref9">9</xref>] impacts the output of dependency-based SRL systems trained on these training sets. If one inspects either a training sample of CoNLL-2009 or an output of a system trained on it, one can expect to encounter the dissociated nucleus phenomenon [10, ch. 23]. For example, the training sample “That is a service to the nation” presents a dissociated nucleus: the structural and semantic functions of the subclause “to the nation” are fulfilled by two separate tokens, ‘to’ and ‘nation’, respectively. The annotation provided within CoNLL-2009 identifies the syntactic core ‘to’ with the argument A2 for the nominal predicate ‘service’ because it is the head of the original dependency relation subtree corresponding to the argument span. Consequently, many tokens annotated as arguments are simple adpositions of little semantic significance. This detail matters for downstream tasks that use SRL outputs as input: if we wanted to extract relations or perform disambiguation on the example above, we would have much more interest in focusing on the word ‘nation’ than on the adposition ‘to’.</p>
      <p>A way to quantify this phenomenon is to look at the frequency of part-of-speech (POS) tags of role tokens in the corpus. We are interested in the POS label of “Preposition or subordinating conjunction”, which is the second-most frequent tag with 76,821 role tokens out of a total of 475,069, or ~17% of all the role tokens. Table 5 in the Appendix provides a complete breakdown over all POS classes in the English split of CoNLL-2009.</p>
      <p>We argue that both the training corpora and dependency-based SRL systems should identify the semantic core of an argument span as the head of the argument. In Appendix A we provide further examples of this phenomenon in non-English partitions of CoNLL-2009.</p>
      <sec id="sec-2-1">
        <title>3. Re-associating Dissociated Nuclei</title>
        <p>Having established that the current annotations in CoNLL-2009 are susceptible to the dissociated nucleus phenomenon, we aim to mitigate this issue by introducing a subtree alignment algorithm that leverages the characteristics of Universal Dependencies [13, 14, UD] to collapse arguments that have been placed on structural tokens with their corresponding semantic tokens. UD explicitly addresses the dissociated nucleus issue by extending the definition of a nominal to encompass the entire nominal extended projection, following the linguistic theory proposed by Grimshaw [15]. The nominal head is used as the referential core and the adposition is treated as a functional marker [14, Section 3.1.1]. When constructing the dependency tree structures, UD guidelines [14, Section 2.1.1] indicate that the head of a particular subclause should be its main content word, i.e. the nominal head. Parsers trained on UD Treebanks recognize dependency subtrees where the head is the semantic core of the subclause, effectively mitigating the dissociated nucleus phenomenon. We leverage this characteristic of UD parsers to automatically annotate the whole CoNLL-2009 corpus using trankit [16], which emerges as the strongest UD parser in the comparison we include in Appendix B.</p>
      </sec>
      <sec id="sec-2-2">
        <title>3.1. SemDepAlign: subtree alignment</title>
        <p>We introduce SemDepAlign, a novel algorithm for syntactic parse semi-alignment from the dependency annotations in CoNLL-2009 to UD, described in Algorithm 1. SemDepAlign is a deterministic subtree aligning algorithm that, for each role token associated with a predicate, finds the UD subtree that most closely matches the original subtree headed by that token in the original dependency tree of CoNLL-2009. It then returns the head node of the UD subtree, which will be assigned the role label in the aligned SRL annotation.</p>
        <p>As shown in Algorithm 1, SemDepAlign starts from the UD root node (root_ud), loops over the nodes of the tree through a breadth-first search (BFS), and finds the node which heads the subtree with minimal symmetric set difference (SymDiff) between its tokens and the set of tokens in the original role span (role_tokens). The symmetric difference between two sets of tokens <inline-formula><tex-math>S_1</tex-math></inline-formula> and <inline-formula><tex-math>S_2</tex-math></inline-formula> is defined through the set operations difference (∖) and union (∪) like so: <inline-formula><tex-math>(S_1 \setminus S_2) \cup (S_2 \setminus S_1)</tex-math></inline-formula>. Intuitively, if the symmetric difference between the original and the UD subtree is the empty set, they match exactly and we can simply select the head of the UD subtree as the role token.</p>
        <p>Otherwise, selecting the head of the UD subtree with the minimal symmetric difference compared to the original subtree is equivalent to selecting the subtree with the most overlap with the original span.</p>
        <p>Figure 1 gives an example of the output of SemDepAlign: at the top of the figure we display the original annotation of the sentence derived from the English split of CoNLL-2009, with the presence of a dissociated nucleus in three of the four roles for the predicate “called”; in the bottom part we show the output of our alignment procedure, which moves the role annotations to the tokens that perform the semantic function.</p>
        <p>Figure 1: Original CoNLL-2009 annotation (top) and SemDepAlign output (bottom) for the sentence “At last night’s rally they called on their followers to be firm”, with the roles AM-LOC, A0, A1 and A2 of the predicate call.05.</p>
        <sec id="sec-2-2-1">
          <title>3.2. Aligned-CoNLL09: analysis</title>
          <p>We apply SemDepAlign to CoNLL-2009 to mitigate the dissociated nucleus phenomenon, obtaining the Aligned-CoNLL2009 dataset. After the application of SemDepAlign, the number of role token annotations that are modified is considerable over all CoNLL-2009 languages (between 21% and 32% of the total roles), except for Czech (~7%).</p>
          <p>To gain a better understanding of the differences that the alignment process introduces, we consider the annotations of the original tokens that are modified by SemDepAlign and the resulting aligned role tokens. We measure three metrics on these two sets to evaluate their semantic richness:</p>
          <list list-type="bullet">
            <list-item><p>Number of content words, i.e. words that are either nouns, adjectives, adverbs, or verbs, which indicates that the heads identified by SemDepAlign are more varied (2713 vs. 680 for English, 3.99×);</p></list-item>
            <list-item><p>Number of unique tokens, which indicates that the heads identified by SemDepAlign are less repetitive (1906 vs. 477, 4×);</p></list-item>
            <list-item><p>Number of unique synsets, which indicates that the heads identified by SemDepAlign are associated with different meanings (1387 vs. 481, 2.88×) according to a Word Sense Disambiguation system [17].</p></list-item>
          </list>
          <p>From Table 1 we can see how SemDepAlign dramatically increases the semantic content of role tokens in English, Spanish and German, identifying more than 4× the number of content words, more than 2.5× the number of unique tokens and around 3× the number of unique synsets compared to the original annotations. We find a smaller but consistent increase of semantic content in Catalan and Chinese, whilst in Czech all metrics are similar, indicating a reduced effect of SemDepAlign.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Integrating re-associated nuclei</title>
      <p>Although we demonstrate that re-associated nuclei in dependency-based SRL provide additional semantic information, an important research question is whether integrating our proposal into current systems can lead to a change in performance. Therefore, we build on top of the strong SRL model proposed by Conia and Navigli [18] and design a new approach that jointly learns both types of role annotations, i.e. the original role tokens and the aligned ones.</p>
      <p>Table 1: Number of modified roles, content words, unique tokens and unique synsets per language, in columns O and A.</p>
      <p>In brief, this architecture derives a contextualized word representation for each word in a sentence from a BERT-like Pretrained Language Model [19, PLM]. It then applies a custom “fully-connected” stacked-BiLSTM sequence encoder to derive a predicate-aware representation, which is in turn used to derive a predicate- and argument-specific embedding for each word in the sentence. Finally, an argument-specific fully-connected BiLSTM is applied to further encode each word with respect to a specific predicate, from which it derives the final score distribution over the role vocabulary through a simple linear classifier. The model is trained to minimize the sum of categorical cross-entropy losses on predicate identification, predicate disambiguation and argument identification and classification.</p>
      <p>To adapt this model for our joint modeling task, we duplicate the linear classifier for the semantic roles and set two different targets for the two role classifiers: the original role token and label from CoNLL-2009 and the aligned role token and label obtained with SemDepAlign. Our final loss adds terms for UD-aligned argument identification and classification to the original loss.</p>
      <sec id="sec-3-1">
        <title>Experimental setup</title>
        <p>We use XLM-RoBERTa-base [20] as the underlying PLM, and leave other hyperparameters unchanged. We conduct our experiments on all of the language splits of CoNLL-2009, namely, Catalan, Czech, German, English, Spanish, and Chinese.</p>
      </sec>
      <sec>
        <title>Results</title>
        <p>Table 2 compares the results of our joint-modeling alignment system against our baseline on the CoNLL-2009 validation and test sets. Importantly, we observe that the additional task of modeling the semantic core of an argument does not significantly alter the performance (very similar F1 score on the test set), despite the added difficulty brought by the identification of semantic cores. Table 3, instead, provides a breakdown of the F1 scores on predicate, role and aligned role predictions. The aligned system is in line with the baseline despite being tasked with a more complex objective. More interestingly, we observe that the F1 score on the semantic heads is comparable, indicating that the model is able to identify UD-aligned roles effectively.</p>
      </sec>
      <sec id="sec-3-2">
        <title>5. Semantic roles in AMR graphs</title>
        <p>We also develop an evaluation method based on the Abstract Meaning Representation formalism [11, AMR] for Semantic Parsing. The interconnection between SRL and AMR is well-known across the literature [21, 22]: both tasks aim to construct a semantic representation of a sentence, although SRL, covering only surface-level semantic frames, is more superficial than AMR, which aims to provide a more complete and in-depth structured representation that can interconnect different semantic frames.</p>
        <p>Given that AMR aims to abstract away from the specific syntax of a sentence to focus only on its semantic content, our intuition is that a dependency-based SRL system is more “semantic” if its predictions of predicate-role pairs are contained in the AMR annotation for the same sentence.</p>
        <p>Therefore, we devise the AMR-precision metric: given a sentence, its gold annotated AMR graph with token-node alignments available and a set of dependency-based SRL predictions, we filter the predicted semantic frames so that the predicate of each frame is present in the gold AMR graph. We then compute the ratio of the number of role tokens that are connected to their predicate in the AMR graph to the total number of roles predicted.</p>
        <p>Given the SRL system introduced in Section 4, we apply it to the AMR3.0 (LDC2020T02) test datasets, keeping both the standard and the aligned role predictions. We then compute the AMR-precision for both sets of predicted roles, and compare them in Table 4. It is clear that aligned roles are more likely to be present in the corresponding AMR graph of a sentence, with a consistent difference in AMR-precision in all test datasets except Proxy reports. This particular dataset has a “templatic, report-like structure” as mentioned in the AMR3.0 guidelines, so it is possible that the reduced performance is due to this particular characteristic.</p>
        <p>Table 4: AMR-precision of the standard and aligned role predictions on each AMR3.0 test dataset (LORELEI, Weblog and WSJ, Xinhua MT, BOLT DF MT, BOLT DF English, Proxy reports) and their average.</p>
        <p>This finding can pave the way for future work exploring the linkage between these two fundamental semantic tasks, as also suggested in the multi-layer annotation provided in MOSAICo [23].</p>
      </sec>
      <sec id="sec-3-3">
        <p>Syntactic information has always been considered important for recognizing semantic frames in SRL. Marcheggiani and Titov [24] were among the first to model the dependency information provided in dependency-based SRL, followed most recently by Xia et al. [25] and Fei et al. [26]. These works differ in their modeling choices and in the kind of extra syntactic data they include (e.g. constituency trees, POS tags).</p>
        <p>We also considered other syntactic frameworks, such as HPSG [27], to align the role annotations. HPSG robustly models the relationship between semantic cores of a sentence, but the lack of automatic tools with an acceptable performance and the difficulty in aligning dependency-based subtrees to HPSG spans compelled us to use UD.</p>
      </sec>
      <sec>
        <title>7. Conclusion</title>
        <p>In this paper, we conducted an in-depth investigation on the dissociated nucleus issue in dependency-based SRL. We introduced SemDepAlign, a novel method to align predicate-argument structures in SRL with syntactic parses from the Universal Dependencies project, which addresses the dissociated nucleus phenomenon. Our analyses and experiments in SRL modeling demonstrate that our approach to dissociated nuclei brings more semantic richness whilst remaining competitive on standard benchmarks.</p>
      </sec>
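      <p>The AMR-precision metric defined in Section 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the representation of frames as (predicate, roles) pairs of token indices, and the reading of “connected” as “joined by a direct AMR edge” are choices made for this example, and the toy alignment below is invented rather than taken from AMR3.0.</p>
      <preformat>
```python
def amr_precision(frames, amr_tokens, amr_edges):
    """Share of predicted roles that are linked to their predicate in the AMR.

    frames: list of (predicate_token, role_tokens) pairs from the SRL system.
    amr_tokens: token indices aligned to some node of the gold AMR graph.
    amr_edges: pairs of token indices whose aligned nodes share an AMR edge.
    """
    # Keep only the frames whose predicate is present in the gold AMR graph.
    kept = [(pred, roles) for pred, roles in frames if pred in amr_tokens]
    total = sum(len(roles) for _, roles in kept)
    if total == 0:
        return 0.0
    # Assumption: edges are treated as undirected for the connectivity check.
    edges = {frozenset(edge) for edge in amr_edges}
    connected = sum(
        1 for pred, roles in kept for role in roles
        if frozenset((pred, role)) in edges
    )
    return connected / total


# Toy alignment for "That is a service to the nation" (0-indexed tokens):
# 'That' (0), 'service' (3) and 'nation' (6) are aligned to AMR nodes.
amr_tokens = {0, 3, 6}
amr_edges = {(3, 0), (3, 6)}

original = [(3, [4])]  # role placed on the adposition 'to'
aligned = [(3, [6])]   # role placed on the content word 'nation'

print(amr_precision(original, amr_tokens, amr_edges))  # prints 0.0
print(amr_precision(aligned, amr_tokens, amr_edges))   # prints 1.0
```
      </preformat>
      <p>In this toy case the adposition has no aligned AMR node, so only the aligned role counts as connected to its predicate.</p>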
      <sec id="sec-3-4">
        <title>8. Limitations</title>
        <p>A limitation of our work is that it builds upon existing dependency parsers trained on Universal Dependencies. These parsers have reached high robustness across many languages, between 85 and 93 in Labeled Attachment Score (LAS) on the languages present in CoNLL-2009. However, the errors that these automatic methods inevitably make propagate directly to our alignment algorithm, with no way of recovering from them. This limitation would be even more impactful in languages where the automatic dependency parser performs worse, presumably in low-resource settings, preventing a robust expansion of our work to these settings.</p>
        <p>A more methodological limitation of our contributions concerns the availability of the CoNLL-2009 dataset. Although it is a well-established corpus in the SRL literature, it has a proprietary licensing scheme and one must acquire the resource from the Linguistic Data Consortium (LDC). We trust that, given the importance of the corpus, this will not limit the relevance of our work.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <sec id="sec-4-1">
        <p>Simone Conia gratefully acknowledges the support of the PNRR MUR project PE0000013-FAIR, which fully funds his fellowship.</p>
        <p>Roberto Navigli also gratefully acknowledges the support of the CREATIVE project (CRoss-modal understanding and gEnerATIon of Visual and tExtual content), which is funded by the MUR Progetti di Rilevante Interesse Nazionale programme (PRIN 2020).</p>
      </sec>
      <sec id="sec-4-2">
        <p>… International Conference on Language Resources and Evaluation (LREC’14), European Language Resources Association (ELRA), Reykjavik, Iceland, 2014, pp. 4585–4592. URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/1062_Paper.pdf.</p>
        <p>[14] M.-C. de Marneffe, C. D. Manning, J. Nivre, D. Zeman, Universal Dependencies, Computational Linguistics 47 (2021) 255–308. URL: https://aclanthology.org/2021.cl-2.11. doi:10.1162/coli_a_00402.</p>
        <p>[15] J. Grimshaw, Argument Structure, The MIT Press, Cambridge, MA, 1990.</p>
        <p>[16] M. V. Nguyen, V. Lai, A. P. B. Veyseh, T. H. Nguyen, Trankit: A light-weight transformer-based toolkit for multilingual natural language processing, in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, 2021.</p>
        <p>[17] R. Orlando, S. Conia, F. Brignone, F. Cecconi, R. Navigli, AMuSE-WSD: An all-in-one multilingual system for easy Word Sense Disambiguation, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 2021, pp. 298–307. URL: https://aclanthology.org/2021.emnlp-demo.34. doi:10.18653/v1/2021.emnlp-demo.34.</p>
        <p>[18] S. Conia, R. Navigli, Bridging the gap in multilingual semantic role labeling: a language-agnostic approach, in: Proceedings of the 28th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020, pp. 1396–1410. URL: https://aclanthology.org/2020.coling-main.120. doi:10.18653/v1/2020.coling-main.120.</p>
        <p>[19] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.</p>
        <p>[20] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 8440–8451. URL: https://aclanthology.org/2020.acl-main.747. doi:10.18653/v1/2020.acl-main.747.</p>
        <p>[21] L. Chen, P. Wang, R. Xu, T. Liu, Z. Sui, B. Chang, ATP: AMRize then parse! enhancing AMR parsing with PseudoAMRs, in: M. Carpuat, M.-C. de Marneffe, I. V. Meza Ruiz (Eds.), Findings of the Association for Computational Linguistics: NAACL 2022, Association for Computational Linguistics, Seattle, United States, 2022, pp. 2482–2496. URL: https://aclanthology.org/2022.findings-naacl.190. doi:10.18653/v1/2022.findings-naacl.190.</p>
        <p>[22] R. Navigli, Natural Language Understanding: Instructions for (Present and Future) Use, in: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, International Joint Conferences on Artificial Intelligence Organization, 2018, pp. 5697–5702. URL: https://doi.org/10.24963/ijcai.2018/812. doi:10.24963/ijcai.2018/812.</p>
        <p>[23] S. Conia, E. Barba, A. C. Martinez Lorenzo, P.-L. Huguet Cabot, R. Orlando, L. Procopio, R. Navigli, MOSAICo: a multilingual open-text semantically annotated interlinked corpus, in: K. Duh, H. Gomez, S. Bethard (Eds.), Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Association for Computational Linguistics, Mexico City, Mexico, 2024, pp. 7990–8004. URL: https://aclanthology.org/2024.naacl-long.442. doi:10.18653/v1/2024.naacl-long.442.</p>
        <p>[24] D. Marcheggiani, I. Titov, Encoding sentences with graph convolutional networks for semantic role labeling, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, 2017, pp. 1506–1515. URL: https://aclanthology.org/D17-1159. doi:10.18653/v1/D17-1159.</p>
        <p>[25] Q. Xia, R. Wang, Z. Li, Y. Zhang, M. Zhang, Semantic role labeling with heterogeneous syntactic knowledge, in: Proceedings of the 28th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020, pp. 2979–2990. URL: https://aclanthology.org/2020.coling-main.266. doi:10.18653/v1/2020.coling-main.266.</p>
        <p>[26] H. Fei, S. Wu, Y. Ren, F. Li, D. Ji, Better combine them together! integrating syntactic constituency and dependency representations for semantic role labeling, in: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Association for Computational Linguistics, Online, 2021, pp. 549–559. URL: https://aclanthology.org/2021.findings-acl.49. doi:10.18653/v1/2021.findings-acl.49.</p>
        <p>[27] C. Pollard, I. Sag, Head-Driven Phrase Structure Grammar, Studies in Contemporary Linguistics, University of Chicago Press, 1994. URL: https://books.google.it/books?id=Ftvg8Vo3QHwC.</p>
        <p>[28] M. Straka, UDPipe 2.0 prototype at CoNLL 2018 UD shared task, in: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Association for Computational Linguistics, Brussels, Belgium, 2018, pp. 197–207. URL: https://aclanthology.org/K18-2020. doi:10.18653/v1/K18-2020.</p>
        <p>[29] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C. D. Manning, Stanza: A Python natural language processing toolkit for many human languages, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020. URL: https://nlp.stanford.edu/pubs/qi2020stanza.pdf.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>A. Dissociated nuclei in non-English samples of CoNLL-2009</title>
      <sec id="sec-5-1">
        <title>A.1. Catalan</title>
        <p>Original sentence: “Piqué recomana les fusions entre empreses per millorar la rendibilitat.”</p>
        <p>Translation: “Piqué recommends mergers between companies to improve profitability.”</p>
        <p>Dissociated nucleus: In the clause “per millorar” (“to improve”), ‘per’ (‘to’) is tagged as argM-fin for predicate ‘recomana’ (‘recommends’) instead of the head of the subclause ‘millorar’ (‘improve’).</p>
      </sec>
      <sec>
        <p>Original sentence: “Don Antonio se encontraba en su casa cuando sonó el timbre de la puerta.”</p>
        <p>Translation: “Don Antonio was at his home when the doorbell rang.”</p>
        <p>Dissociated nucleus: The role “en su casa” (“at his home”) for predicate ‘encontraba’ (‘was’) is tagged as arg2-loc on the token ‘en’ (‘in’) instead of the semantic nucleus ‘casa’ (‘home’).</p>
      </sec>
      <sec>
        <title>A.4. Chinese</title>
        <p>Original sentence: 巴拉克 在 民意 测验 中 一直 表现 不 佳 。</p>
        <p>Transliteration: “Barak in public opinion test in continuously performance no good.”</p>
        <p>Translation: “Barak has consistently underperformed in the polls.”</p>
        <p>Dissociated nucleus: In the clause 在 民意 测验 (“in the public opinion polls”), the token 在 (‘in’) is tagged as the LOC role for the nominal predicate 佳 (‘good’), instead of the more semantic 测验 (‘polls’).</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>B. Universal Dependency parsers</title>
      <sec id="sec-6-1">
        <p>We consider three of the best off-the-shelf dependency parsers, namely, trankit [16], UDPipe [28] and Stanza [29]. Table 6 compares the reported evaluation of each parser on standard treebanks for Catalan, Czech, German, English, Spanish and Chinese. We choose trankit as it achieves a higher UAS and LAS than the two alternatives in all languages except Spanish (slightly worse than UDPipe), with a considerable margin in Chinese.</p>
        <p>Table 6: Reported scores of each system on the Catalan AnCora, Czech PDT, German GSD, English EWT, Spanish AnCora and Chinese Simplified GSD treebanks, and their average.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Gildea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <article-title>Automatic Labeling of Semantic Roles</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>28</volume>
          (
          <year>2002</year>
          )
          <fpage>245</fpage>
          -
          <lpage>288</lpage>
          . URL: https://doi.org/10.1162/089120102760275983. doi:10.1162/089120102760275983.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Carreras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Litkowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Stevenson</surname>
          </string-name>
          ,
          <article-title>Semantic Role Labeling: An Introduction to the Special Issue</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>34</volume>
          (
          <year>2008</year>
          )
          <fpage>145</fpage>
          -
          <lpage>159</lpage>
          . URL: https://doi.org/10.1162/coli.2008.34.2.145. doi:10.1162/coli.2008.34.2.145.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          , X. Cheng,
          <article-title>Event coreference resolution with their paraphrases and argument-aware embeddings</article-title>
          , in: D. Scott, N. Bel, C. Zong (Eds.),
          <source>Proceedings of the 28th International Conference on Computational Linguistics</source>
          ,
          <publisher-name>International Committee on Computational Linguistics</publisher-name>
          ,
          <publisher-loc>Barcelona, Spain (Online)</publisher-loc>
          ,
          <year>2020</year>
          , pp.
          <fpage>3084</fpage>
          -
          <lpage>3094</lpage>
          . URL: https://aclanthology.org/2020.coling-main.275. doi:10.18653/v1/2020.coling-main.275.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Marasović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <article-title>SRL4ORL: Improving opinion role labeling using multi-task learning with semantic role labeling</article-title>
          , in: M. Walker, H. Ji, A. Stent (Eds.),
          <source>Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long Papers),
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , New Orleans, Louisiana,
          <year>2018</year>
          , pp.
          <fpage>583</fpage>
          -
          <lpage>594</lpage>
          . URL: https://aclanthology.org/N18-1054. doi:10.18653/v1/N18-1054.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
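          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,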
          <article-title>Enhancing opinion role labeling with semantic-aware word representations from semantic role labeling</article-title>
          , in: J. Burstein, C. Doran, T. Solorio (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>641</fpage>
          -
          <lpage>646</lpage>
          . URL: https://aclanthology.org/N19-1066. doi:10.18653/v1/N19-1066.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.-M.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hovy</surname>
          </string-name>
          ,
          <article-title>Extracting opinions, opinion holders, and topics expressed in online news media text</article-title>
          , in: M. Gamon, A. Aue (Eds.),
          <source>Proceedings of the Workshop on Sentiment and Subjectivity in Text</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Sydney, Australia,
          <year>2006</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . URL: https://aclanthology.org/W06-0301.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Reed</surname>
          </string-name>
          ,
          <article-title>Argument mining: A survey</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>45</volume>
          (
          <year>2019</year>
          )
          <fpage>765</fpage>
          -
          <lpage>818</lpage>
          . URL: https://aclanthology.org/J19-4006. doi:10.1162/coli_a_00364.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Falke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Utilizing automatic predicate-argument analysis for concept map mining</article-title>
          , in: C. Gardent, C. Retoré (Eds.),
          <source>Proceedings of the 12th International Conference on Computational Semantics (IWCS) - Short papers</source>
          ,
          <year>2017</year>
          . URL: https://aclanthology.org/W17-6909.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hajič</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ciaramita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Johansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kawahara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Martí</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Meyers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nivre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Padó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Štěpánek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Straňák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Surdeanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Xue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages</article-title>
          , in:
          <source>Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          ,
          <year>2009</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          . URL: https://aclanthology.org/W09-1201.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Tesnière</surname>
          </string-name>
          ,
          <source>Elements of Structural Syntax</source>
          ,
          <publisher-name>John Benjamins</publisher-name>
          ,
          <year>2015</year>
          . URL: https://www.jbe-platform.com/content/books/9789027269997.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Banarescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bonial</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Georgescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Griffitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Hermjakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Knight</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Koehn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <article-title>Abstract Meaning Representation for sembanking</article-title>
          , in: A. Pareja-Lora, M. Liakata, S. Dipper (Eds.),
          <source>Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Sofia, Bulgaria,
          <year>2013</year>
          , pp.
          <fpage>178</fpage>
          -
          <lpage>186</lpage>
          . URL: https://aclanthology.org/W13-2322.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Surdeanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Johansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Meyers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nivre</surname>
          </string-name>
          ,
          <article-title>The CoNLL 2008 shared task on joint parsing of syntactic and semantic dependencies</article-title>
          , in:
          <source>CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning</source>
          ,
          <publisher-name>Coling 2008 Organizing Committee</publisher-name>
          , Manchester, England,
          <year>2008</year>
          , pp.
          <fpage>159</fpage>
          -
          <lpage>177</lpage>
          . URL: https://aclanthology.org/W08-2121.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>de Marneffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dozat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Silveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Haverinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ginter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nivre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>Universal Stanford dependencies: A cross-linguistic typology</article-title>
          , in: N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, S. Piperidis (Eds.),
          <source>Proceedings of the Ninth</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>