
Proceedings of the ASSIN 2 Shared Task

and Textual Entailment in Portuguese

ASSIN 2 is the second edition of the Evaluation of Semantic Similarity and Textual Inference (Avaliação de Similaridade Semântica e Inferência Textual) in Portuguese, which took place as a parallel event with the STIL conference in 2019. Like the previous edition, it proposed a shared task on Semantic Similarity and Textual Entailment: the former scores pairs of sentences from 1 to 5, and the latter labels them as either entailment or non-entailment (but not paraphrase, in contrast with the first edition).

There are some notable differences between the first and second editions of the shared task. Concerning the data, a new corpus of 10 thousand sentences was presented, but instead of text extracted from news articles, it contains much simpler sentence pairs, modeled after the SICK corpus. Because the sentences were written specifically for this task, some linguistic phenomena could be directly controlled. As a result, a word overlap baseline is not as strong on ASSIN 2 as it was on ASSIN 1.

On the systems side, we saw a reflection of recent developments in neural networks. While hand-engineered features and lexical resources are still useful, pretrained language models proved very helpful for both tasks evaluated.

This volume presents the main findings of the shared task organizers and the descriptions of the strategies developed by the participants. With nine participants in total, we are happy with the results of ASSIN 2. We leave a new dataset as a benchmark for evaluating progress in this area for Portuguese, along with reflections on its research directions.

February, 2020

Erick Fonseca, Livy Real

Hugo Gonçalo Oliveira Erick Fonseca Reviewers

B2W Digital/Grupo de Linguística Computacional – USP, Brazil CISUC / DEI, Universidade de Coimbra, Portugal Instituto de Telecomunicações, Lisboa, Portugal

Organizing the ASSIN 2 Shared Task
Livy Real, Erick Fonseca, Hugo Gonçalo Oliveira

NILC at ASSIN 2: Exploring Multilingual Approaches
Marco A. Sobrevilla Cabezudo, Marcio Inácio, Ana Carolina Rodrigues, Edresson Casanova, Rogério Figueredo de Sousa