=Paper=
{{Paper
|id=Vol-2765/154
|storemode=property
|title=KIPoS @ EVALITA2020: Overview of the Task on KIParla Part of Speech Tagging
|pdfUrl=https://ceur-ws.org/Vol-2765/paper154.pdf
|volume=Vol-2765
|authors=Cristina Bosco,Silvia Ballarè,Massimo Cerruti,Eugenio Goria,Caterina Mauri
|dblpUrl=https://dblp.org/rec/conf/evalita/BoscoBCGM20
}}
==KIPoS @ EVALITA2020: Overview of the Task on KIParla Part of Speech Tagging==
Cristina Bosco⋆, Silvia Ballarè, Massimo Cerruti⊕, Eugenio Goria⊕, Caterina Mauri

⋆ Dipartimento di Informatica, Università degli Studi di Torino
Dipartimento di Filologia Classica e Italianistica, Università degli Studi di Bologna
⊕ Dipartimento di Studi Umanistici, Università degli Studi di Torino
Dipartimento di Lingue, Letterature e Culture Moderne, Università degli Studi di Bologna

{cristina.bosco,massimosimone.cerruti,eugenio.goria}@unito.it,
{silvia.ballare,caterina.mauri}@unibo.it
English. The paper describes KIPoS, the first task on Part of Speech tagging of spoken language held at the Evalita evaluation campaign. Benefiting from the availability of a resource of transcribed spoken Italian (i.e. the KIParla corpus), which has been newly annotated and released for KIPoS, the task includes three evaluation exercises focused on formal versus informal spoken texts. The datasets and the results achieved by participants are presented, and the insights gained from the experience are discussed.

Italiano. L'articolo descrive il primo task sul Part of Speech tagging di lingua parlata tenutosi nella campagna di valutazione Evalita. Usufruendo di una risorsa che raccoglie trascrizioni di lingua italiana (il corpus KIParla), annotate appositamente per KIPoS, il task è stato focalizzato intorno a tre valutazioni con lo scopo di confrontare i risultati raggiunti sul parlato formale con quelli ottenuti sul parlato informale. Il corpus di dati ed i risultati raggiunti dai partecipanti sono presentati insieme alla discussione di quanto emerso dall'esperienza di questo task.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Motivation

Even though in the last decades we have witnessed an increase in the resources available for the study of spoken Italian, a great imbalance can still be observed between spoken and written corpora, from several angles. Written corpora are generally larger, provide a great deal of information about the texts they include, and can count on a vast array of computational tools for morphological analysis and syntactic parsing. Conversely, spoken corpora of Italian are generally smaller, often give minimal information about the speakers and the context in which the interaction takes place and, finally, provide at most basic PoS-tagging and lemmatization tools. This, of course, places considerable limitations on the searches that can be performed on these resources, eventually leading to a possible written language bias due to the different availability and richness of information in written vs. spoken corpora (Linell, 2005).

As a consequence of this imbalance, corpus-based sociolinguistic analyses of spoken Italian, which need a comprehensive set of metadata, have rarely been put to the test on publicly available speech corpora. In fact, most sociolinguistic studies have been conducted on ad hoc-collected datasets; see inter al. (Alfonzetti, 2002; Mereu, 2019).

The KIParla corpus (Mauri et al., 2019), of approximately 661k tokens and available at the website www.kiparla.it, has been designed to overcome some shortcomings of previous resources. KIParla is a corpus of spoken Italian which encompasses various types of interaction between speakers of different origins and socioeconomic backgrounds. It consists of speech data collected in Bologna and Turin between 2016 and 2019, and contains two independent modules, i.e. KIP (cf. Section 3) and ParlaTO. Among other things, KIParla provides a wide range of metadata, including situational characteristics (such as the symmetrical vs. asymmetrical relationship between the participants) and socio-demographic information for each speaker (such as age and level of education). Nevertheless, the lack of PoS-tagging and lemmatization currently places severe limits on its application.

In order to enrich the range of investigations that can be applied to the KIParla corpus, we proposed the KIPoS task. Following the experience of the Evalita 2016 PoSTWITA task on PoS tagging Italian Social Media Texts (Bosco et al., 2016) and the subsequent development of an Italian treebank for social media (Sanguinetti et al., 2017; Sanguinetti et al., 2018), which addressed the issues related to a particularly challenging written text genre, KIPoS offers the opportunity to address the theoretical and methodological challenges related to PoS tagging of Italian spontaneous speech. Carrying out this task means processing a type of data that is known to be problematic for computational treatment, namely unplanned spoken language (as opposed to experimental speech data). PoS tagging of this corpus entails dealing with both a wide range of spontaneous speech phenomena and a great amount of sociolinguistic variation.

The most challenging aspects to be addressed in the unconstrained speech of KIParla are:

• To identify mode-specific phenomena, such as repetitions, reformulations, fillers, incomplete syntactic structures, etc.

• To trace a relevant set of non-standard alternatives back to the same linguistic phenomenon (e.g. the presence of socio-geographically marked forms like annà or andà, equal to standard Italian andare "to go"), either assigning them to the correct part of speech or working out an ad-hoc solution.

• To deal with different types of interaction and registers (casual conversations, interviews, office hours, etc.) with a variable number of participants (1 to 5), each transcribed on a separate line and corresponding to an autonomous text string.

PoS-tagging the KIParla corpus is intended to improve the current practices in use for tagging and parsing spoken Italian. Furthermore, this result is also significant for the purposes of (socio)linguistic research, in that the availability of annotated spoken corpora enables researchers to validate previous assumptions based on smaller or less informative datasets, but also to collect knowledge that can be meaningfully used in the development of automatic conversation systems and chatbots.

2 Definition of the task

Given the innovative features of KIParla, we proposed KIPoS as a task for EVALITA 2020 (Basile et al., 2020) to address the issues involved in adapting a PoS tagger to the specific features of oral text, in order to systematically represent those features and to provide the means to access their specificities. We therefore provided data for training (i.e. Development Set, henceforth DEVSET) and testing (Test Set, henceforth TESTSET) systems, organized in two ensembles which respectively represent formal (DEVSET–formal and TESTSET–formal) and informal texts (DEVSET–informal and TESTSET–informal). This allowed us to consider one main task and two subtasks, described as follows:

• Main task - general: training on all given data (both DEVSET–formal and DEVSET–informal) and testing on all test set data (both TESTSET–formal and TESTSET–informal)

• Subtask A - crossFormal: training on data from DEVSET–formal only, and testing separately on data from formal texts (TESTSET–formal) and from informal texts (TESTSET–informal)

• Subtask B - crossInformal: training on data from DEVSET–informal only, and testing separately on data from formal texts (TESTSET–formal) and from informal texts (TESTSET–informal).

While all tasks are oriented to investigating how challenging it can be to PoS-tag spontaneous speech data, the cross ones are especially useful for validating the hypothesis that some differences occur between the tagging of formal conversations and that of informal conversations. As we will see in Sections 5 and 6, this hypothesis is partially confirmed by the results. Some examples useful to illustrate the difference between the registers are provided in the next section.

3 Datasets

All the data provided for the KIPoS task are extracted from the KIP module (see Section 1),
Dataset   Register   Speakers   Turns   Tokens
DEVSET    Formal     5          1,998   13,864
          Informal   11         3,804   19,259
TESTSET   Formal     2          459     3,642
          Informal   2          582     3,532

Table 1: The sizes of the datasets.
which includes various communicative situations occurring in the academic context. As explained in detail in Mauri et al. (2019), the recordings involve five different types of interactions, each of which is assigned, for the aims of KIPoS, either to the section of formal texts or to the section of informal texts (mainly on the basis of the relationship between the participants, i.e. asymmetrical vs. symmetrical).

The KIP corpus structure can thus be outlined as follows:

• Formal dataset:
  – lessons
  – office hours
  – oral examinations

• Informal dataset:
  – semi-structured interviews
  – casual conversations.

Below are examples of formal (1) and informal (2) texts.

(1) (KIP Corpus, BOC1001, oral examination)

BO088: una volta che carlo magno conquisto' l'italia fu permesso ad anselmo di tornare eh a mantova
BO088: nel settecentosettantaquattro
BO088: ehme cosi' po pote' riprendere la sua attivita' prima eh di creazione della biblioteca
BO088: perche' secondo appunto l'uso eh delle biblioteche eh
BO088: medioev medievali diciamo prima eh vi era
BO088: mh la insomma la raccolta di libri dall'esterno

(2) (KIP Corpus, BOA3001, casual conversation)

BO003: povero cristo sono andata a beccare questo
BO002: ma poi scusa il piu' carino di tutti lo cornifichi
BO003: si' si' si' esa poi secondo me lui e' il piu' carino di tutti
BO003: cioe' tra per i miei gusti tra il gruppo
BO002: no eh
BO002: carino sia
BO002: di viso ma anche
BO003: poi e' anche il piu' si' si' si' e' cornificatissimo non cornificato

Both excerpts feature spontaneous speech phenomena, such as fillers, repetitions and reformulations. However, example (1) shows several characteristics of formal styles, either cross-linguistically shared (e.g. clausal subordination, passive constructions, abstract and specific terms) or language-specific (e.g. the existential construction with vi as pre-copular proform), while example (2) displays various features typical of informal styles, such as simple sentence structure and pragmatically-marked word orders (e.g. il più carino di tutti lo cornifichi), multi-functional words (e.g. carino), colloquialisms (e.g. povero cristo, beccare, cornifichi, cornificato), elatives (e.g. cornificatissimo), deictics (e.g. questo, lui) and discourse markers (e.g. cioè, scusa).

All speakers were informed of the aims of the project, agreed to the recording and signed a consent form.

The set of data exploited for KIPoS consists of around 200K tokens, corresponding to approximately one-third of the whole KIParla corpus, with an equal proportion of informal and formal speech data.

For the purposes of KIPoS, UDPipe trained on all the treebanks available for Italian within the Universal Dependencies repository (https://universaldependencies.org/it/index.html) has been applied to this 200K-token portion of the KIParla corpus. Among these data, approximately 30K tokens have been submitted to a careful manual check and correction (we thank three students for their precious help: Filippo Mulinacci, Martina Pittalis and Roberto Russo of the Department of Modern Languages, Literatures and Cultures of the University of Bologna) and released as the training sets of the KIPoS task (i.e. DEVSET–formal and
Team     Affiliation
UniBO    FICLIT – University of Bologna
UniBA    University of Bari "Aldo Moro"
KLUMSy   Friedrich Alexander Universität Erlangen-Nürnberg & Universität Stuttgart

Table 2: The teams that participated in KIPoS and their affiliations.
DEVSET–informal). From the remaining automatically annotated data, we extracted the formal TESTSET and informal TESTSET, and we also manually checked and validated them. Finally, we released the remaining data as a silver standard (i.e. SILVERSET). These data have also been made available together with the other data (all the data annotated for KIPoS are available at https://github.com/boscoc/kipos2020, with the licence and the annotation guidelines) to be used for training participants' systems.

3.1 Annotation

As far as the annotation is concerned, for the purpose of the task the original orthographic transcriptions were provided in a tab-delimited .txt format. Three main identifiers are used in this format, respectively indicating the conversation (alphanumeric), the speaker's ID (alphanumeric) and the position of the turn (numeric) within the context of the conversation. (The alphanumeric code used to name the KIP conversations provides information about the city in which the data has been collected (BO=Bologna, TO=Turin) and the kind of interaction (A1=office hours, A3=free conversation, C1=exams, D1=lessons, D2=interviews). For example, BOD2018 is a semi-structured interview recorded in Bologna.) For instance, the example below includes the first three turns of the conversation "BOD2018", in which three different speakers are involved ("1_MP_BO118", "2_MP_BO118" and "3_AM_BO140"):

# conversation = BOD2018
# speaker = 1_MP_BO118
# turn = 1
# text = dovresti parlarmi della tua casa
1	dovresti	AUX
2-3	parlarmi	VERB_PRON
2	parlar	VERB
3	mi	PRON
4-5	della	ADP_A
4	di	ADP
5	la	DET
6	tua	DET
7	casa	NOUN

# conversation = BOD2018
# speaker = 2_MP_BO118
# turn = 2
# text = attuale
1	attuale	ADJ

# conversation = BOD2018
# speaker = 3_AM_BO140
# turn = 3
# text = mh sì
1	mh	PARA
2	sì	INTJ

The format and the labels used for tagging the part of speech of the KIPoS data comply with those of the Universal Dependencies Italian treebanks. Data were indeed released in a CoNLL-U-like format which only includes its first three columns, separated by tabs as usual. For a detailed list and description of the tagset used in the KIPoS datasets, see the Appendix at the end of this paper.

3.2 Tokenization Issues

As far as words consisting of multiple tokens are concerned, in the data released for the development and training of participant systems (DEVSET–formal and DEVSET–informal) we annotated both the compound form and its split components. See for instance, in the first turn of the example above, lines 2-3, 2 and 3: a verb with a clitic suffix occurs, and it is annotated as a compound on line 2-3, while its components, i.e. the verb and the clitic, are separately annotated on lines 2 and 3 respectively.

In contrast, for the purpose of the evaluation, the format applied to the test set (TESTSET–formal and TESTSET–informal) includes only one word per line, regardless of whether a word is composed of more than one token. This makes the format of the test set slightly different from that used in the development data, but more compliant with the evaluation scripts and procedures. An example of this format follows, consisting of the first turn of the example above:

# conversation = BOD2018
# speaker = 1_MP_BO118
# turn = 1
# text = dovresti parlarmi della tua casa
1	dovresti	AUX
2	parlarmi	VERB_PRON
3	della	ADP_A
4	tua	DET
5	casa	NOUN
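The CoNLL-U-like format described above is straightforward to read programmatically. The sketch below (function and variable names are our own, not part of the task distribution) parses turns and collapses DEVSET-style compound annotations, where a range id such as 2-3 precedes its component lines, into the word-level view used in the TESTSET:

```python
# Minimal sketch of a reader for the KIPoS tab-delimited format, assuming
# "# key = value" comment lines and "ID<TAB>FORM<TAB>TAG" token lines.
# For multi-token words, the compound line (e.g. "2-3 parlarmi VERB_PRON")
# is kept and the component ids it covers are skipped.

def read_turns(lines):
    """Yield (metadata, tokens) pairs, one per turn.

    metadata: dict built from the "# key = value" comment lines
    tokens:   list of (form, tag) pairs at word level
    """
    meta, tokens, skip = {}, [], set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:                      # a blank line ends the turn
            if tokens:
                yield meta, tokens
            meta, tokens, skip = {}, [], set()
        elif line.startswith("#"):
            key, _, value = line[1:].partition("=")
            meta[key.strip()] = value.strip()
        else:
            tok_id, form, tag = line.split("\t")
            if "-" in tok_id:             # compound line: keep it, skip parts
                start, end = tok_id.split("-")
                skip.update(str(i) for i in range(int(start), int(end) + 1))
                tokens.append((form, tag))
            elif tok_id not in skip:
                tokens.append((form, tag))
    if tokens:                            # last turn (no trailing blank line)
        yield meta, tokens

sample = """\
# conversation = BOD2018
# speaker = 1_MP_BO118
# turn = 1
# text = dovresti parlarmi della tua casa
1\tdovresti\tAUX
2-3\tparlarmi\tVERB_PRON
2\tparlar\tVERB
3\tmi\tPRON
4-5\tdella\tADP_A
4\tdi\tADP
5\tla\tDET
6\ttua\tDET
7\tcasa\tNOUN
"""

for meta, tokens in read_turns(sample.splitlines(True)):
    print(meta["turn"], tokens)
```

Run on the first turn of the example above, this yields the same five word-level pairs as the test-set format (dovresti AUX, parlarmi VERB_PRON, della ADP_A, tua DET, casa NOUN).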
Task       DEVSET                TESTSET    Team     Score
Baseline (from PoSTWITA)                             0.9319
Main       formal and informal   formal     UniBO    0.934880
                                            KLUMSy   0.875629
                                            UniBA    0.815819
                                 informal   UniBO    0.911316
                                            KLUMSy   0.882368
                                            UniBA    0.793684
Task A     formal                formal     KLUMSy   0.873672
                                            UniBA    0.787311
                                 informal   KLUMSy   0.875789
                                            UniBA    0.757895
Task B     informal              formal     KLUMSy   0.878144
                                            UniBA    0.771101
                                 informal   KLUMSy   0.881053
                                            UniBA    0.775000

Table 3: The official scores achieved by participants for the three subtasks (Main, Task A and Task B), training systems on both or one of the datasets provided for development (DEVSET–formal and DEVSET–informal) and testing on TESTSET–formal and TESTSET–informal (best scores for each subtask in bold face).
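The scores in Table 3 are token-level accuracies, the single measure defined for the task in Section 4: correct tag assignments divided by the total number of gold tokens, compared token by token with one tag per token. A minimal sketch (the tag sequences here are invented for illustration, not task data):

```python
# Token-by-token accuracy as used for KIPoS: the number of correctly
# assigned tags divided by the number of tokens in the gold test set.

def accuracy(gold_tags, predicted_tags):
    """Fraction of positions where the predicted tag matches the gold tag."""
    assert len(gold_tags) == len(predicted_tags)
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)

# Invented example: one tagging error out of five tokens.
gold = ["AUX", "VERB_PRON", "ADP_A", "DET", "NOUN"]
pred = ["AUX", "VERB",      "ADP_A", "DET", "NOUN"]
print(accuracy(gold, pred))  # → 0.8
```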
In this example, the verb with clitic suffix "parlarmi" (speak to me) has been annotated as a compound on a single line, i.e. line 2.

4 Evaluation measures

For the KIPoS task a single measure has been used for the evaluation of participants' runs, i.e. accuracy, which is defined as the number of correct Part-of-Speech tag assignments divided by the total number of tokens in the gold TESTSET. The evaluation metric is based on a token-by-token comparison, and only a single tag is allowed for each token.

The evaluation is performed in a black-box approach, where only the systems' output is evaluated.

5 Participation and Results

As shown in Table 3, where the results of the main task and the two subtasks are presented at a glance, three teams submitted their runs for KIPoS (see Table 2 for their affiliations). Nevertheless, one team participated in the main task only, while the other two provided results for Tasks A and B too.

The three teams applied different approaches. The UniBA team used a combination of two taggers implementing two different approaches, namely a stochastic Hidden Markov Model and a rule-based one. UniBO applied a fine-tuning approach to Part of Speech tagging based on a pre-trained neural language BERT-derived model (UmBERTo) and an adapted fine-tuning script. KLUMSy used a tagger based on the averaged structured perceptron, which supports domain adaptation and can incorporate external resources to deal with the limited availability of in-domain data.

The overall highest accuracy has been achieved in the main task by the UniBO team on the TESTSET–formal. The availability of a larger training corpus for the main task, which includes both the DEVSET–formal and the DEVSET–informal, and the results calculated on both portions of the TESTSET allowed, as expected, the achievement of the KIPoS overall best score. This is confirmed also by the fact that all teams provided their best runs in it, for both the formal and the informal register. Even if the official submission of UniBO did not include the runs for Tasks A and B, the results provided in its report (Tamburini, 2020) show that this team too ranked worse in Tasks A and B than in the main one. More precisely, for Task A it achieved 0.8647 accuracy on TESTSET–formal and 0.8316 on TESTSET–informal, while in Task B it achieved 0.8974 on TESTSET–formal and 0.8952 on TESTSET–informal.

As far as the other teams are concerned, UniBA
provided in its report (Izzi and Ferilli, 2020) also the results achieved using a version of the TESTSET in which a few errors detected after the official evaluation have been fixed. This allowed a small improvement in their scores (e.g. in the main task, +0.0078 for the formal and +0.0056 for the informal register).

The KLUMSy team provided the best runs for both registers in Tasks A and B, but in its runs, because of a misunderstanding of the guidelines about the annotation of contractions in the TESTSET (which is slightly different from the DEVSET), a certain amount of mis-tagged tokens occurred. After these were fixed, the scores of this team also improved (with an increase varying from 0.0187 to 0.0456) with respect to the official ones reported in Table 3, as described in the team's report (Proisl and Lapesa, 2020).

Considering that PoS tagging is a mostly solved task, it is not surprising that the participants' scores are quite high and close across all tracks. The largest difference observed between the best and the worst score is indeed 0.126, referring to Task B on TESTSET–formal.

Given the peculiarity of the oral texts on which KIPoS is focused, a comparison of our results with state-of-the-art PoS tagger results for the written standard language does not seem especially meaningful. A more interesting comparison can instead be developed with respect to the scores achieved within the PoSTWITA task (Bosco et al., 2016) on written texts extracted from social media. This genre is indeed often considered in between written and oral, sharing some features with the former and some with the latter. Using the best PoSTWITA accuracy score (0.9319) as our baseline (see Table 3), we can observe that the best scores achieved in KIPoS are in line with this result. This confirms the hypothesis that oral text can be considered almost as hard to tag morphologically as social media text.

As far as the distinction between formal and informal conversation drawn in the KIPoS datasets is concerned, a general trend of better scores in tagging formal data can be observed, but some meaningful differences among participant systems occur. For all subtasks UniBO scored best on formal text, while KLUMSy did the same on informal data. UniBA instead achieved its best scores on TESTSET–formal, with the exception of Task B, where its score for the informal test set is a little bit (0.0038) higher than that for the formal one. Focusing on the cross subtasks A and B, we can moreover notice that the systems were not equally influenced by the type of data exploited for training: UniBO provided its best scores on TESTSET–formal even when trained on DEVSET–informal (Task B), while KLUMSy provided its best scores on TESTSET–informal even when trained on DEVSET–formal (Task A). UniBA seems instead slightly more influenced by the features of the data used in training.

6 Discussion and Conclusion

The results described in this report can only be considered preliminary. First of all, KIPoS is the first edition of a task on PoS tagging of spontaneous speech for Italian, and there are no other results on this kind of task for the same language to compare with. Second, the corpus used for KIPoS has been newly released for the purpose of the task and never used before. Participants provided some useful feedback about errors occurring in the DEVSET and TESTSET, but some further checks should be applied to improve the quality of the data. Finally, only three participants submitted their runs (and only two provided official runs for the cross-genre tasks). Even if PoS tagging is among the tasks considered mostly solved in the literature, only a larger participation may allow a meaningful comparison among different approaches and results.

Nevertheless, the KIPoS task produced the valuable result of making available a novel resource for the study of spoken Italian and for the advancement of NLP in this area. It can be of great relevance for the investigation of both spontaneous speech phenomena and sociolinguistic variation, but also, e.g., in the development of chatbots and voice recognition systems.

In particular, the insights gained within the context of this Evalita evaluation campaign for PoS tagging can pave the way for further investigation of actual speech data. They provide a solid foundation for our future research, also in the direction of more detailed morphological analysis and syntactic parsing, especially within the framework of Universal Dependencies, where we would like to release the KIPoS dataset in the near future.
7 Acknowledgments

The construction of part of the corpus has been possible thanks to the financing of the Fondazione CRT under the Erogazioni ordinarie 2018 program. The KIParla corpus has been made possible thanks to the SIR Project 'LEAdHOC' (n. RBSI14IIG0), funded by MIUR. We would also like to thank the students from our BA and MA courses at the Universities of Bologna and Torino, who participated in collecting and transcribing the data.

References

Giovanna Alfonzetti. 2002. La relativa non-standard. Italiano popolare o italiano parlato? Centro di Studi Filologici e Linguistici Siciliani, Palermo.

Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro. 2020. EVALITA 2020: Overview of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Cristina Bosco, Fabio Tamburini, Andrea Bolioli, and Alessandro Mazzei. 2016. Overview of the EVALITA 2016 Part Of Speech on TWitter for ITAlian task. In Proceedings of Evalita 2016.

Cristina Bosco, Silvia Ballarè, Massimo Cerruti, Eugenio Goria, and Caterina Mauri. 2020. KIPoS @ EVALITA2020: Overview of the Task on KIParla Part of Speech tagging. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Giovanni Luca Izzi and Stefano Ferilli. 2020. UniBA@KIPoS: A Hybrid Approach for Part-of-Speech Tagging. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020), Online. CEUR.org.

Per Linell. 2005. The Written Language Bias in Linguistics: Its Nature, Origins and Transformations. Routledge, London – New York.

Caterina Mauri, Silvia Ballarè, Eugenio Goria, Massimo Cerruti, and Francesco Suriano. 2019. KIParla Corpus: A New Resource for Spoken Italian. In Raffaella Bernardi, Roberto Navigli, and Giovanni Semeraro, editors, Proceedings of the 6th Italian Conference on Computational Linguistics (CLiC-it 2019), Online. CEUR.org.

Daniela Mereu. 2019. Il sardo parlato a Cagliari. Franco Angeli, Milano.

Thomas Proisl and Gabriella Lapesa. 2020. KLUMSy@KIPoS: Experiments on Part-of-Speech Tagging of Spoken Italian. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020), Online. CEUR.org.

Manuela Sanguinetti, Cristina Bosco, Alessandro Mazzei, Alberto Lavelli, and Fabio Tamburini. 2017. Annotating Italian Social Media Texts in Universal Dependencies. In Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), pages 229–239.

Manuela Sanguinetti, Cristina Bosco, Alberto Lavelli, Alessandro Mazzei, and Fabio Tamburini. 2018. PoSTWITA-UD: An Italian Twitter Treebank in Universal Dependencies. In Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), pages 1768–1775.

Fabio Tamburini. 2020. UniBO@KIPoS: Fine-tuning the Italian "BERTology" for PoS-tagging Spoken Data. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020), Online. CEUR.org.
APPENDIX: The KIPoS tagset

Tag        Value(s)                                                    Examples
ADJ        • Qualifying, numeral, possessive adjectives                una bella casa
           • Interrogative adjectives                                  quanti anni hai?
           • Adjectives used as pro-forms                              -ci vediamo domani? -esatto
ADP        • Prepositions                                              di, a, da, senza te, tranne, ...
           • Postpositions                                             vent'anni fa
ADP_A      • Articled prepositions                                     dalla, nella, sulla, ...
ADV        • Adverbs                                                   lo metto qui
           • Interrogative adverbs                                     non ricordo come si chiama
AUX        • Auxiliaries                                               essere, avere
           • Modals                                                    potere, volere, dovere
           • Periphrastic auxiliaries                                  sta mangiando, viene visto, ...
CCONJ      • Coordinating conjunctions                                 e, ma, o, però, anzi, quindi, dunque, ...
           • Discourse markers with predominantly connective function
DET        • Articles                                                  ho visto un film
           • Demonstratives                                            la senti questa voce?
           • Numerals                                                  ho giocato tre numeri al lotto
           • Possessives                                               non nominare mia sorella
           • Quantifiers                                               alcuni studenti sono assenti
DIA        • Italo-Romance dialects                                    c'erano due fiulin
INTJ       • Interjections                                             sì, no, ecco, ...
LIN        • Languages other than Italian                              vi saluto guys
NEG        • Sentence negation                                         non
NOUN       • Nouns of any type except proper nouns                     ho visto un re
NUM        • Numbers (but not numeral adjectives)                      -quanti sono? -tre
PARA       • Paraverbal communication                                  eh, mh, oh, bla bla, ...
PRON       • Personal and reflexive pronouns                           io, me, tu, te, sé, ...
           • Interrogative pronouns                                    chi?, cosa?, quale?, che?
           • Relative pronouns                                         il quale, dove, cui
PROPN      • Proper nouns                                              Gigi
SCONJ      • Subordinating conjunctions                                dove, quando, perché; ho detto che...; se vuoi
VERB       • Verbs                                                     aveva vent'anni; era molto stanco
VERB_PRON  • Verb + clitic pronoun cluster                             mangiarlo, donarglielo, ...
X          • Other (e.g. truncated words)                              fior-
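A simple consistency check over annotated files follows from the tagset above: every tag in a KIPoS-style annotation should be one of the twenty labels listed in the Appendix. The sketch below is our own illustration (the constant and function names are not part of the official task kit):

```python
# Sketch of a tagset check: collect any tag in a KIPoS-style tab-delimited
# annotation that is not among the labels listed in the Appendix.

KIPOS_TAGS = {
    "ADJ", "ADP", "ADP_A", "ADV", "AUX", "CCONJ", "DET", "DIA", "INTJ",
    "LIN", "NEG", "NOUN", "NUM", "PARA", "PRON", "PROPN", "SCONJ",
    "VERB", "VERB_PRON", "X",
}

def unknown_tags(lines):
    """Return the set of tags not covered by the KIPoS tagset."""
    bad = set()
    for line in lines:
        line = line.rstrip("\n")
        if line and not line.startswith("#"):   # skip comments and blanks
            tag = line.split("\t")[-1]          # tag is the last column
            if tag not in KIPOS_TAGS:
                bad.add(tag)
    return bad

# "FOO" is a deliberately invalid tag used to exercise the check.
sample = ["# turn = 3", "1\tmh\tPARA", "2\tsì\tINTJ", "3\tboh\tFOO"]
print(unknown_tags(sample))  # → {'FOO'}
```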