=Paper= {{Paper |id=Vol-1749/paper_007 |storemode=property |title=Overview of the EVALITA 2016 Named Entity rEcognition and Linking in Italian Tweets (NEEL–IT) Task |pdfUrl=https://ceur-ws.org/Vol-1749/paper_007.pdf |volume=Vol-1749 |authors=Pierpaolo Basile,Annalina Caputo,Anna Lisa Gentile,Giuseppe Rizzo |dblpUrl=https://dblp.org/rec/conf/clic-it/BasileCGR16 }}
Overview of the EVALITA 2016 Named Entity rEcognition and Linking in Italian Tweets (NEEL-IT) Task

Pierpaolo Basile¹, Annalina Caputo², Anna Lisa Gentile³, Giuseppe Rizzo⁴
¹ Department of Computer Science, University of Bari Aldo Moro, Bari (Italy), pierpaolo.basile@uniba.it
² ADAPT Centre, Trinity College Dublin, Dublin (Ireland), annalina.caputo@adaptcentre.ie
³ University of Mannheim, Mannheim (Germany), annalisa@informatik.uni-mannheim.de
⁴ Istituto Superiore Mario Boella, Turin (Italy), giuseppe.rizzo@ismb.it

Abstract

English. This report describes the main outcomes of the 2016 Named Entity rEcognition and Linking in Italian Tweets (NEEL-IT) Challenge. The goal of the challenge is to provide a benchmark corpus for the evaluation of entity recognition and linking algorithms specifically designed for noisy and short texts, like tweets, written in Italian. The task requires the correct identification of entity mentions in a text and their linking to the proper named entities in a knowledge base. To this aim, we chose to use the canonicalized dataset of DBpedia 2015-10. The task attracted five participants, for a total of 15 runs submitted.

Italiano. In this report we describe the main results achieved in the first task for the Italian language on Named Entity rEcognition and Linking in Tweets (NEEL-IT). The task aims to offer an evaluation framework for named entity recognition and linking algorithms specifically designed for the Italian language in short and noisy texts such as tweets. The task comprises a phase of recognition of named entity mentions in the text and their subsequent linking to the appropriate entities in a knowledge base. For this task we chose as knowledge base the canonicalized version of DBpedia 2015-10. The task attracted five participants, for a total of 15 different runs.

1 Introduction

Tweets represent a great wealth of information useful to understand recent trends and user behaviours in real-time. Usually, natural language processing techniques would be applied to such pieces of information in order to make them machine-understandable. Named Entity rEcognition and Linking (NEEL) is a particularly useful technique aiming to automatically annotate tweets with named entities. However, due to the noisy nature and shortness of tweets, this technique is more challenging in this context than elsewhere. International initiatives provide evaluation frameworks for this task, e.g. the Making Sense of Microposts workshop (Dadzie et al., 2016), which hosted the 2016 NEEL Challenge (Rizzo et al., 2016), or the W-NUT workshop at ACL 2015 (Baldwin et al., 2015), but the focus is always and strictly on the English language. We see an opportunity to (i) encourage the development of language-independent tools for Named Entity Recognition (NER) and Linking (NEL) and (ii) establish an evaluation framework for the Italian community. NEEL-IT at EVALITA has the vision to establish itself as a reference evaluation framework in the context of Italian tweets.

2 Task Description

NEEL-IT followed a setting similar to the NEEL challenge for English microposts on Twitter (Rizzo et al., 2016). The task consists of annotating each named entity mention (like people, locations, organizations, and products) in a text by linking it to a knowledge base (DBpedia 2015-10).

Specifically, each task participant is required to:

1. Recognize and type each entity mention that appears in the text of a tweet;
                                            Table 1: Example of annotations.
         id        begin     end     link                                                           type
         288...    0         18      NIL                                                            Product
         288...    73        86      http://dbpedia.org/resource/Samsung_Galaxy_Note_II             Product
         288...    89        96      http://dbpedia.org/resource/Nexus_4                            Product
         290...    1         15      http://dbpedia.org/resource/Carlotta_Ferlito                   Person
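Annotation files in this format are plain TSV and straightforward to process; the following is a minimal reading sketch (the field names `Annotation`, `begin`, `end`, etc. are our own choices for illustration):

```python
import csv
from typing import NamedTuple

class Annotation(NamedTuple):
    tweet_id: str
    begin: int       # start offset of the mention in the tweet text
    end: int         # end offset of the mention
    link: str        # DBpedia URI, or a NIL cluster identifier
    category: str    # one of the taxonomy categories (Person, Product, ...)

def read_annotations(path):
    """Parse a NEEL-IT style TSV file: one annotation per line, with
    tab-separated tweet id, start offset, end offset, link, category."""
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            tweet_id, begin, end, link, category = row
            yield Annotation(tweet_id, int(begin), int(end), link, category)
```

The same layout is used both for system runs and for the gold standard described in Section 3.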


2. Disambiguate and link each mention to the canonicalized DBpedia 2015-10, which is used as the reference Knowledge Base. This means that if an entity is present in the Italian DBpedia but not in the canonicalized version, the mention should be tagged as NIL. For example, the mention Agorà can only be referenced to the Italian DBpedia entry Agorà¹, but this entry has no correspondence in the canonicalized version of DBpedia; it is therefore tagged as a NIL instance.

3. Cluster together the non-linkable entities, which are tagged as NIL, in order to provide a unique identifier for all the mentions that refer to the same named entity.

In the annotation process, a named entity is a string in the tweet representing a proper noun that: 1) belongs to one of the categories specified in a taxonomy and/or 2) can be linked to a DBpedia concept. This means that some concepts have a NIL DBpedia reference².

The taxonomy is defined by the following categories:

Thing languages, ethnic groups, nationalities, religions, diseases, sports, astronomical objects;

Event holidays, sport events, political events, social events;

Character fictional characters, comics characters, title characters;

Location public places, regions, commercial places, buildings;

Organization companies, subdivisions of companies, brands, political parties, government bodies, press names, public organizations, collections of people;

Person people's names;

Product movies, tv series, music albums, press products, devices.

Excluded from the annotation are the preceding article (like il, lo, la, etc.) and any other prefix (e.g. Dott., Prof.) or post-posed modifier. Each participant is asked to produce an annotation file with multiple lines, one for each annotation. A line is a tab-separated sequence of tweet id, start offset, end offset, linked concept in DBpedia, and category. For example, given the tweet with id 288976367238934528:

   Chameleon Launcher in arrivo anche per smartphone: video beta privata su Galaxy Note 2 e Nexus 4: Chameleon Laun...

the annotation process is expected to produce the output as reported in Table 1.

The annotation process is also expected to link Twitter mentions (@) and hashtags (#) that refer to named entities, like in the tweet with id 290460612549545984:

   @CarlottaFerlito io non ho la forza di alzarmi e prendere il libro! Help me

whose correct annotation is also reported in Table 1.

Participants were allowed to submit up to three runs of their system as TSV files. We encouraged participants to make their systems available to the community to facilitate reuse.

3 Corpus Description and Annotation Process

The NEEL-IT corpus consists of both a development set (released to participants as training set) and a test set. Both sets are composed of two TSV files: (1) the tweet id file, a list of all the tweet ids used for training; (2) the gold standard, containing the annotations for all the tweets in the development set following the format shown in Table 1.

The development set was built upon the dataset produced by Basile et al. (2015). This dataset is composed of a sample of 1,000 tweets randomly selected from the TWITA dataset (Basile and Nissim, 2013). We updated the gold standard links to the canonicalized DBpedia 2015-10. Furthermore, the dataset underwent another round of annotation performed by a second annotator in order to maximize the consistency of the links. Tweets that presented conflicts were then resolved by a third annotator.

Data for the test set was generated by randomly selecting 1,500 tweets from the SENTIPOLC test data (Barbieri et al., 2016). From this pool, 301 tweets were randomly chosen for the annotation process and represent our Gold Standard (GS). This sub-sample was chosen in coordination with the task organisers of SENTIPOLC (Barbieri et al., 2016), POSTWITA (Tamburini et al., 2016) and FacTA (Minard et al., 2016b) with the aim of providing a unified framework for multiple layers of annotations.

The tweets were split in two batches, each of which was manually annotated by two different annotators. Then, a third annotator intervened in order to resolve those debatable tweets with no exact match between annotations. The whole process was carried out with the BRAT³ web-based tool (Stenetorp et al., 2012).

Table 2 reports some statistics on the two sets: in both, the most represented categories are "Person", "Organization" and "Location". "Person" is also the most populated category among the NIL instances, along with "Organization" and "Product". In the development set, the least represented category is "Character" among the NIL instances and both "Thing" and "Event" among the linked ones. A different behaviour can be found in the test set, where the least represented category is "Thing" in both NIL and linked instances.

                Table 2: Datasets Statistics.
      Stat.                    Dev. Set   Test Set
      # tweets                    1,000        301
      # tokens                   14,242      4,104
      # hashtags                    250        108
      # mentions                    624        181
      Mean tokens per tweet       14.24      13.65
      # NIL Thing                    14          3
      # NIL Event                     9          7
      # NIL Character                 4          5
      # NIL Location                  6          9
      # NIL Organization             49         19
      # NIL Person                  150         76
      # NIL Product                  43         12
      # Thing                         6          0
      # Event                         6         12
      # Character                    12          2
      # Location                    116         70
      # Organization                148         56
      # Person                      173         61
      # Product                      65         25
      # NIL instances               275        131
      # Entities                    526        357

4 Evaluation Metrics

Each participant was asked to submit up to three different runs. The evaluation is based on the following three metrics:

STMM (Strong_Typed_Mention_Match). This metric evaluates the micro average F-1 score for all annotations considering the mention boundaries and their types. This is a measure of the tagging capability of the system.

SLM (Strong_Link_Match). This metric is the micro average F-1 score for annotations considering the correct link for each mention. This is a measure of the linking performance of the system.

MC (Mention_Ceaf). This metric, also known as Constrained Entity-Alignment F-measure (Luo, 2005), is a clustering metric developed to evaluate clusters of annotations. It evaluates the F-1 score for both NIL and non-NIL annotations in a set of mentions.

The final score for each system is a combination of the aforementioned metrics and is computed as follows:

   score = 0.4 × MC + 0.3 × STMM + 0.3 × SLM.    (1)

All the metrics were computed by using the TAC KBP scorer⁴.

¹ http://it.dbpedia.org/resource/Agorà_(programma_televisivo)
² These concepts belong to one of the categories but have no corresponding concept in DBpedia.
³ http://brat.nlplab.org/
⁴ https://github.com/wikilinks/neleval/wiki/Evaluation
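The final score of Eq. 1 is straightforward to reproduce; for instance, the metric values later reported in Table 3 for the best run (MC = 0.561, STMM = 0.474, SLM = 0.456) combine to the published final score of 0.5034:

```python
def final_score(mc: float, stmm: float, slm: float) -> float:
    """Final ranking score (Eq. 1): a weighted combination that favours
    NIL clustering (MC) slightly over tagging (STMM) and linking (SLM)."""
    return 0.4 * mc + 0.3 * stmm + 0.3 * slm

# UniPI.3 (Table 3): final_score(0.561, 0.474, 0.456) -> 0.5034
```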
5 Systems Description

The task was well received by the NLP community and attracted 17 participants who expressed their interest in the evaluation. Five groups participated actively in the challenge by submitting their system results; each group presented three different runs, for a total of 15 runs submitted. In this section we briefly describe the methodology followed by each group.

5.1 UniPI

The system proposed by the University of Pisa (Attardi et al., 2016) exploits word embeddings and a bidirectional LSTM for entity recognition and linking. The team also produced a training dataset of about 13,945 tweets for entity recognition by exploiting active learning, training data taken from the PoSTWITA task (Tamburini et al., 2016) and manual annotation. This resource, in addition to word embeddings built on a large corpus of Italian tweets, is used to train a bidirectional LSTM for the entity recognition step. In the linking step, for each Wikipedia page the abstract is extracted and the average of its word embeddings is computed. For each candidate entity in the tweet, the word embedding of a context of c words before and after the entity is created. The linking is performed by comparing the mention embedding against the abstract embeddings computed at the previous step and selecting the DBpedia entity at the smallest L2 distance. Twitter mentions were resolved by retrieving the real name with the Twitter API and looking it up in a gazetteer in order to identify Person-type entities.

5.2 MicroNeel

MicroNeel (Corcoglioniti et al., 2016) investigates the use on microposts of two standard NER and Entity Linking tools originally developed for more formal texts, namely Tint (Palmero Aprosio and Moretti, 2016) and The Wiki Machine (Palmero Aprosio and Giuliano, 2016). Comprehensive tweet preprocessing is performed to reduce noisiness and increase textual context. Existing alignments between Twitter user profiles and DBpedia entities from the Social Media Toolkit (Nechaev et al., 2016) resource are exploited to annotate user mentions in the tweets. Rule-based and supervised (SVM-based) techniques are investigated to merge annotations from different tools and solve possible conflicts. The following resources were employed in the evaluation:

• The Wiki Machine (Palmero Aprosio and Giuliano, 2016): an open source entity linking tool for Wikipedia and multiple languages.

• Tint (Palmero Aprosio and Moretti, 2016): an open source suite of NLP modules for Italian, based on Stanford CoreNLP, which supports named entity recognition.

• Social Media Toolkit (SMT) (Nechaev et al., 2016): a resource and API supporting the alignment of Twitter user profiles to the corresponding DBpedia entities.

• Twitter ReST API⁵: a public API for retrieving Twitter user profiles and tweet metadata.

• Morph-It! (Zanchetta and Baroni, 2005): a free morphological resource for Italian used for preprocessing (true-casing) and as a source of features for the supervised merging of annotations.

• tagdef⁶: a website collecting user-contributed descriptions of hashtags.

• a list of slang terms from Wikipedia⁷.

5.3 FBK-HLT-NLP

The system proposed by the FBK-HLT-NLP team (Minard et al., 2016a) follows 3 steps: entity recognition and classification, entity linking to DBpedia, and clustering. Entity recognition and classification is performed by the EntityPro module (included in the TextPro pipeline), which is based on machine learning and uses the SVM algorithm. Entity linking is performed using the named entity disambiguation module developed within NewsReader and based on DBpedia Spotlight. The FBK team exploited a specific resource to link Twitter profiles to DBpedia: the Alignments dataset. The clustering step is string-based, i.e. two entities are part of the same cluster if they are equal.

⁵ https://dev.twitter.com/rest/public
⁶ https://www.tagdef.com/
⁷ https://it.wikipedia.org/wiki/Gergo_di_Internet
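UniPI's linking step described above amounts to a nearest-neighbour search under the L2 distance, between a mention's averaged context embedding and each candidate entity's averaged abstract embedding. The sketch below illustrates the idea in pure Python; the two-dimensional toy vectors and candidate names are invented for illustration, whereas the real system uses embeddings trained on Italian tweets and Wikipedia abstracts:

```python
from math import dist  # Euclidean (L2) distance, available since Python 3.8

def average(vectors):
    """Component-wise average of a list of equal-length word vectors,
    as used for both abstract and mention-context embeddings."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def link_mention(mention_vec, entity_vecs):
    """Return the candidate entity whose abstract embedding has the
    smallest L2 distance to the mention's context embedding."""
    return min(entity_vecs, key=lambda name: dist(mention_vec, entity_vecs[name]))
```

A mention is then tagged NIL when no candidate lies within an acceptable distance, a threshold we leave out of this sketch.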
Moreover, the FBK team exploits active learning for domain adaptation, in particular to adapt a general purpose Named Entity Recognition system to a specific domain (tweets) by creating new annotated data. In total they annotated 2,654 tweets.

5.4 Sisinflab

The system proposed by Sisinflab (Cozza et al., 2016) faces the NEEL-IT challenge through an ensemble approach that combines unsupervised and supervised methods. The system merges the results achieved by three strategies:

1. DBpedia Spotlight for span and URI detection, plus SPARQL queries to DBpedia for type detection;

2. Stanford CRF-NER trained on the challenge training corpus for span and type detection, and DBpedia lookup for URI detection;

3. DeepNL-NER, a deep learning classifier trained on the challenge training corpus for span and type detection; it exploits ad-hoc gazetteers and word embedding vectors computed with word2vec trained over the Twita dataset⁸ (a subset of 12,000,000 tweets). DBpedia is used for URI detection.

Finally, the system computes NIL clusters for those mentions that do not match an entry in DBpedia, by grouping in the same cluster entities with the same text (regardless of case). The Sisinflab team submitted three runs combining the previous strategies, in particular: run1 combines (1), (2) and (3); run2 involves strategies (1) and (3); run3 exploits strategies (1) and (2).

5.5 UNIMIB

The system proposed by the UNIMIB team (Cecchini et al., 2016) is composed of three steps: 1) Named Entity Recognition using Conditional Random Fields (CRF); 2) Named Entity Linking by considering both supervised and neural-network language models; and 3) NIL clustering by using a graph-based approach. In the first step two kinds of CRF are exploited: 1) a simple CRF on the training data and 2) CRF+Gazetteers, in which the model is induced by exploiting several gazetteers, i.e. products, organizations, persons, events and characters. Two strategies are adopted for the linking. A decision strategy is used to select the best link by exploiting a large set of supervised methods. Then, word embeddings built on Wikipedia are used to compute a similarity measure that selects the best link from a list of candidate entities. NIL clustering is performed by a graph-based approach; in particular, a weighted undirected co-occurrence graph is built, in which an edge represents the co-occurrence of two terms in a tweet. The ensuing word graph is then clustered using the MaxMax algorithm.

6 Results

The performance of the participant systems was assessed by exploiting the final score measure presented in Eq. 1. This measure combines the three different aspects evaluated during the task, i.e. the correct tagging of the mentions (STMM), the proper linking to the knowledge base (SLM), and the clustering of the NIL instances (MC). Results of the evaluation in terms of the final score are reported in Table 3.

The best result was reported by UniPI.3: this system obtained the best final score of 0.5034, with an improvement of +1.27 over UniPI.1 (second classified). The difference between these two runs lies in the different vector dimension (200 in UniPI.3 rather than 100 in UniPI.1), combined with the use of Wikipedia embeddings and a specific training set for geographical entities (UniPI.3) rather than a mention frequency strategy for disambiguation (UniPI.1). MicroNeel.base and FBK-HLT-NLP obtain remarkable results, very close to the best system. Indeed, MicroNeel.base reported the highest linking performance (SLM = 0.477), while FBK-HLT-NLP showed the best clustering (MC = 0.585) and tagging (STMM = 0.516) results. It is interesting to notice that all these systems (UniPI, MicroNeel and FBK-HLT-NLP) developed specific techniques for dealing with Twitter mentions, reporting very good results for the tagging metric (with values always above 0.46).

All participants made use of supervised algorithms at some point of their tagging/linking/clustering pipeline. UniPI, Sisinflab and UNIMIB exploited word embeddings trained on the development set plus some other external resources (manually annotated corpora, Wikipedia, and Twita).

⁸ http://www.let.rug.nl/basile/files/proc/
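Several of the NIL clustering strategies described in Section 5 (Sisinflab, FBK-HLT-NLP) reduce to grouping NIL mentions whose surface strings are equal, optionally ignoring case. A minimal sketch of this baseline follows; the `NIL<n>` identifier format is our own choice for illustration:

```python
from collections import defaultdict
from itertools import count

def cluster_nil(mentions):
    """Assign a shared NIL cluster identifier to all mentions with the
    same (case-insensitive) surface string."""
    ids = defaultdict(count(1).__next__)  # fresh id for each unseen key
    return {m: f"NIL{ids[m.lower()]}" for m in mentions}
```

Despite its simplicity, this string-equality baseline is what the FBK-HLT-NLP system uses for its clustering step.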
UniPI and FBK-HLT-NLP built additional training data obtained by active learning and manual annotation. The use of additional resources is allowed by the task guidelines, and both teams have contributed additional data useful for the research community.

7 Conclusions

We described the first evaluation task for entity linking in Italian tweets. The task evaluated the performance of participant systems in terms of (1) tagging entity mentions in the text of tweets; (2) linking the mentions with respect to the canonicalized DBpedia 2015-10; (3) clustering the entity mentions that refer to the same named entity.

The task attracted many participants who specifically designed and developed algorithms for dealing with both the Italian language and the specific peculiarities of text on Twitter. Indeed, many participants developed ad-hoc techniques for recognising Twitter mentions and hashtags. In addition, the participation in the task has fostered the building of new annotated datasets and corpora for the purpose of training learning algorithms and word embeddings.

We hope that this first initiative has set the scene for further investigations and developments of best practices, corpora and resources for Italian named entity linking on tweets and other microblog contents.

As future work, we plan to build a bigger dataset of annotated contents and to foster the release of state-of-the-art methods for entity linking in the Italian language.

Acknowledgments

This work is supported by the project "Multilingual Entity Linking" co-funded by the Apulia Region under the program FutureInResearch, by the ADAPT Centre for Digital Content Technology, which is funded under the Science Foundation Ireland Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund, and by the H2020 FREME project (GA no. 644771).

References

Giuseppe Attardi, Daniele Sartiano, Maria Simi, and Irene Sucameli. 2016. Using Embeddings for Both Entity Recognition and Linking in Tweets. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).

Timothy Baldwin, Young-Bum Kim, Marie Catherine de Marneffe, Alan Ritter, Bo Han, and Wei Xu. 2015. Shared tasks of the 2015 workshop on noisy user-generated text: Twitter lexical normalization and named entity recognition. ACL-IJCNLP, 126:2015.

Francesco Barbieri, Valerio Basile, Danilo Croce, Malvina Nissim, Nicole Novielli, and Viviana Patti. 2016. Overview of the EVALITA 2016 SENTiment POLarity Classification Task. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).

Valerio Basile and Malvina Nissim. 2013. Sentiment analysis on italian tweets. In Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 100–107, Atlanta, Georgia, June. Association for Computational Linguistics.

Pierpaolo Basile, Annalina Caputo, and Giovanni Semeraro. 2015. Entity Linking for Italian Tweets. In Cristina Bosco, Sara Tonelli, and Fabio Massimo Zanzotto, editors, Proceedings of the Second Italian Conference on Computational Linguistics CLiC-it 2015, Trento, Italy, December 3-8, 2015, pages 36–40. Accademia University Press.

Flavio Massimiliano Cecchini, Elisabetta Fersini, Pikakshi Manchanda, Enza Messina, Debora Nozza, Matteo Palmonari, and Cezar Sas. 2016. UNIMIB@NEEL-IT: Named Entity Recognition and Linking of Italian Tweets. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).
Table 3: Results of the evaluation with respect to: MC (Mention_Ceaf), STMM (Strong_Typed_Mention_Match), SLM (Strong_Link_Match), and the final score used for system ranking. ∆ shows the percentage improvement in final score of each system over the one ranked immediately below it. Best MC, STMM and SLM are reported in bold.
                   name                     MC     STMM        SLM     final score        ∆
                   UniPI.3                0.561       0.474    0.456       0.5034     +1.27
                   UniPI.1                0.561       0.466    0.443       0.4971     +0.08
                   MicroNeel.base         0.530       0.472    0.477       0.4967     +0.10
                   UniPI.2                0.561       0.463    0.443       0.4962     +0.61
                   FBK-HLT-NLP.3          0.585       0.516    0.348       0.4932     +0.78
                   FBK-HLT-NLP.2          0.583       0.508    0.346       0.4894     +1.49
                   FBK-HLT-NLP.1          0.574       0.509    0.333       0.4822     +1.49
                   MicroNeel.merger       0.509       0.463    0.442       0.4751     +0.32
                   MicroNeel.all          0.506       0.460    0.444       0.4736    +38.56
                   sisinflab.1            0.358       0.282    0.380       0.3418      0.00
                   sisinflab.3            0.358       0.286    0.376       0.3418     +2.24
                   sisinflab.2            0.340       0.280    0.381       0.3343    +50.31
                   unimib.run_02          0.208       0.194    0.270       0.2224     +9.50
                   unimib.run_03          0.207       0.188    0.213       0.2031     +5.56
                   unimib.run_01          0.193       0.166    0.218       0.1924      0.00
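The final-score column of Table 3 is consistent with the weighted combination used for ranking, 0.4·MC + 0.3·STMM + 0.3·SLM. As a minimal sketch (the weights are inferred from the table values; the function name is illustrative):

```python
def final_score(mc: float, stmm: float, slm: float) -> float:
    """Ranking score as a weighted combination of the three metrics:
    0.4 * Mention_Ceaf + 0.3 * Strong_Typed_Mention_Match
    + 0.3 * Strong_Link_Match."""
    return 0.4 * mc + 0.3 * stmm + 0.3 * slm

# Example: the UniPI.3 row of Table 3.
print(round(final_score(0.561, 0.474, 0.456), 4))  # 0.5034
```

These weights reproduce every row of the table (e.g. FBK-HLT-NLP.3: 0.4·0.585 + 0.3·0.516 + 0.3·0.348 = 0.4932), privileging mention detection (MC) slightly over typing and linking.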


Francesco Corcoglioniti, Alessio Palmero Aprosio, Yaroslav Nechaev, and Claudio Giuliano. 2016. MicroNeel: Combining NLP Tools to Perform Named Entity Detection and Linking on Microposts. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).

Vittoria Cozza, Wanda La Bruna, and Tommaso Di Noia. 2016. sisinflab: an ensemble of supervised and unsupervised strategies for the NEEL-IT challenge at Evalita 2016. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).

Aba-Sah Dadzie, Daniel Preoțiuc-Pietro, Danica Radovanović, Amparo E. Cano Basave, and Katrin Weller, editors. 2016. Proceedings of the 6th Workshop on Making Sense of Microposts, volume 1691. CEUR.

Xiaoqiang Luo. 2005. On coreference resolution performance metrics. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 25–32. Association for Computational Linguistics.

Anne-Lyse Minard, R. H. Mohammed Qwaider, and Bernardo Magnini. 2016a. FBK-NLP at NEEL-IT: Active Learning for Domain Adaptation. In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).

Anne-Lyse Minard, Manuela Speranza, and Tommaso Caselli. 2016b. The EVALITA 2016 Event Factuality Annotation Task (FactA). In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors, Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016). Associazione Italiana di Linguistica Computazionale (AILC).

Yaroslav Nechaev, Francesco Corcoglioniti, and Claudio Giuliano. 2016. Linking knowledge bases to social media profiles.

Alessio Palmero Aprosio and Claudio Giuliano. 2016. The Wiki Machine: an open source software for entity linking and enrichment. ArXiv e-prints.
Alessio Palmero Aprosio and Giovanni Moretti. 2016.
  Italy goes to Stanford: a collection of CoreNLP
  modules for Italian. ArXiv e-prints, September.
Giuseppe Rizzo, Marieke van Erp, Julien Plu, and
  Raphaël Troncy. 2016. Making Sense of Microp-
  osts (#Microposts2016) Named Entity rEcognition
  and Linking (NEEL) Challenge. In 6th Workshop
  on Making Sense of Microposts (#Microposts2016).
Pontus Stenetorp, Sampo Pyysalo, Goran Topić, Tomoko Ohta, Sophia Ananiadou, and Jun’ichi Tsujii. 2012. BRAT: a web-based tool for NLP-assisted text annotation. In Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL ’12, pages 102–107, Stroudsburg, PA, USA. Association for Computational Linguistics.
Fabio Tamburini, Cristina Bosco, Alessandro Mazzei,
  and Andrea Bolioli. 2016. Overview of the
  EVALITA 2016 Part Of Speech on TWitter for ITAl-
  ian Task. In Pierpaolo Basile, Anna Corazza, Franco
  Cutugno, Simonetta Montemagni, Malvina Nissim,
  Viviana Patti, Giovanni Semeraro, and Rachele
  Sprugnoli, editors, Proceedings of Third Italian
  Conference on Computational Linguistics (CLiC-it
  2016) & Fifth Evaluation Campaign of Natural Lan-
  guage Processing and Speech Tools for Italian. Final
  Workshop (EVALITA 2016). Associazione Italiana di
  Linguistica Computazionale (AILC).
Eros Zanchetta and Marco Baroni. 2005. Morph-it!
  a free corpus-based morphological resource for the
  Italian language. Corpus Linguistics 2005, 1(1).